Automating large-scale data quality verification S Schelter, D Lange, P Schmidt, M Celikel, F Biessmann, A Grafberger Proceedings of the VLDB Endowment 11 (12), 1781-1794, 2018 | 272 | 2018 |
Probabilistic demand forecasting at scale JH Böse, V Flunkert, J Gasthaus, T Januschowski, D Lange, D Salinas, ... Proceedings of the VLDB Endowment 10 (12), 1694-1705, 2017 | 175 | 2017 |
DataWig: Missing value imputation for tables F Biessmann, T Rukat, P Schmidt, P Naidu, S Schelter, A Taptunov, ... Journal of Machine Learning Research 20 (175), 1-6, 2019 | 120 | 2019 |
"Deep" Learning for Missing Value Imputation in Tables with Non-numerical Data F Biessmann, D Salinas, S Schelter, P Schmidt, D Lange Proceedings of the 27th ACM international conference on information and …, 2018 | 94 | 2018 |
Extracting structured information from Wikipedia articles to populate infoboxes D Lange, C Böhm, F Naumann Proceedings of the 19th ACM international conference on Information and …, 2010 | 93 | 2010 |
Cross-lingual entity matching and infobox alignment in Wikipedia D Rinser, D Lange, F Naumann Information Systems 38 (6), 887-907, 2013 | 68 | 2013 |
Efficient similarity search in very large string sets D Fenz, D Lange, A Rheinländer, F Naumann, U Leser Scientific and Statistical Database Management: 24th International …, 2012 | 36 | 2012 |
Automated data validation in machine learning systems F Biessmann, J Golebiowski, T Rukat, D Lange, P Schmidt | 35 | 2021 |
Unit testing data with deequ S Schelter, F Biessmann, D Lange, T Rukat, P Schmidt, S Seufert, ... Proceedings of the 2019 International Conference on Management of Data, 1993 …, 2019 | 26 | 2019 |
Deequ-data quality validation for machine learning pipelines S Schelter, P Schmidt, T Rukat, M Kiessling, A Taptunov, F Biessmann, ... | 25 | 2018 |
Differential data quality verification on partitioned data S Schelter, S Grafberger, P Schmidt, T Rukat, M Kiessling, A Taptunov, ... 2019 IEEE 35th International Conference on Data Engineering (ICDE), 1940-1945, 2019 | 21 | 2019 |
Reach for gold: An annealing standard to evaluate duplicate detection results T Vogel, A Heise, U Draisbach, D Lange, F Naumann Journal of Data and Information Quality (JDIQ) 5 (1-2), 1-25, 2014 | 21 | 2014 |
Efficient Similarity Search: Arbitrary Similarity Measures, Arbitrary Composition D Lange, F Naumann Proceedings of the 20th ACM international conference on Information and …, 2011 | 19 | 2011 |
Frequency-aware similarity measures: why Arnold Schwarzenegger is always a duplicate D Lange, F Naumann Proceedings of the 20th ACM international conference on Information and …, 2011 | 16 | 2011 |
Towards automated data quality management for machine learning T Rukat, D Lange, S Schelter, F Biessmann ML Ops Work. Conf. Mach. Learn. Syst, 1-3, 2020 | 12 | 2020 |
Towards automated ML model monitoring: Measure, improve and quantify data quality T Rukat, D Lange, S Schelter, F Biessmann | 9 | 2020 |
An interpretable latent variable model for attribute applicability in the Amazon catalogue T Rukat, D Lange, C Archambeau arXiv preprint arXiv:1712.00126, 2017 | 5 | 2017 |
Cost-aware query planning for similarity search D Lange, F Naumann Information Systems 38 (4), 455-469, 2013 | 5 | 2013 |
Effective and efficient similarity search in databases D Lange Universitätsbibliothek der Universität Potsdam, 2013 | 3 | 2013 |
Scalable similarity search with dynamic similarity measures M Köppelmann, D Lange, C Lehmann, M Marszalkowski, F Naumann, ... Proceedings on the International Workshop on Ranking in Database, 2012 | 2 | 2012 |