Why Data Engineers Need Hacker Skills
Data Engineering is not just about enterprise-internal data but also about data gathering from external sources and data security.
Fuzzy matching was used to combine the data from the two sources. Several fuzzy string-matching algorithms were tested, for example Jaro-Winkler distance, Levenshtein distance, Soundex, and cosine similarity. Open-source implementations of these algorithms can be very efficient, and the right choice depends on the use case.
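To make the idea concrete, here is a minimal sketch of fuzzy matching based on Levenshtein distance, written in plain Python without external libraries. It is not the implementation used in the project described here. The function names (`levenshtein`, `similarity`, `best_match`), the lowercase normalization, and the 0.8 threshold are assumptions chosen for illustration, and the company names are made up.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means equal after normalization."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(name, candidates, threshold=0.8):
    """Return the most similar candidate, or None if none reaches the threshold.

    The threshold is a hypothetical default and needs tuning per data set.
    """
    best = max(candidates, key=lambda c: similarity(name, c), default=None)
    if best is None or similarity(name, best) < threshold:
        return None
    return best


if __name__ == "__main__":
    # Made-up records standing in for the two data sources.
    source_a = ["Acme Corp", "Zeta"]
    source_b = ["ACME Corp.", "Apex Ltd"]
    for name in source_a:
        print(name, "->", best_match(name, source_b))
```

Comparing every record in one source against every record in the other scales quadratically, so for larger data sets an optimized open-source library and a blocking step (comparing only records that share, say, a postal code) are usually the more practical choice.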
Big data already plays a decisive role in almost all industries and has become a fundamental driver of revenue. Nevertheless, only a few companies have a mature data strategy in order…