The course Data Science in Engineering gives an overview of modern data science methods with a focus on process engineering applications. It first covers the basics of data pre-treatment: detecting multicollinearity and linear dependencies, imputing missing values, detecting anomalies and handling outliers. It then covers methods for feature selection and extraction, for example stepwise variable selection, L1/L2 regularization (such as Lasso) and PCA. Based on this, supervised machine learning methods are introduced to solve regression problems. Besides linear methods (Linear Regression, Lasso, Robust Regression), nonlinear methods such as Support Vector Regression, Gaussian Process Regression and Artificial Neural Networks are introduced. For dynamic data-driven modeling, recurrent neural networks are also covered.
The methods are explained using examples from chemical and process engineering, and these examples are made available to the students. Software frameworks in Python are used throughout.
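To give a feel for the workflow the course describes, the following is a minimal sketch of three of the named steps: imputation of missing values, outlier handling, and a linear regression fit. It is not taken from the course material. The data (a reactor temperature and a product yield), all variable and function names, and the outlier threshold of 3.5 are hypothetical. It uses only the Python standard library so that it runs anywhere, whereas the course itself may rely on dedicated frameworks.

```python
import statistics

# Hypothetical process data: reactor temperature in K and product yield in %.
# One yield value is missing (None) and one is an obvious sensor spike.
temperature = [300.0, 310.0, 320.0, 330.0, 340.0, 350.0, 360.0, 370.0]
yield_raw = [10.0, 11.0, None, 13.0, 95.0, 15.0, 16.0, 17.0]


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def mad_outlier_mask(values, threshold=3.5):
    """Flag points whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so a single spike barely shifts the reference.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_mean = statistics.fmean(x)
    y_mean = statistics.fmean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Step 1: fill the missing value.
yield_filled = impute_median(yield_raw)

# Step 2: flag outliers and keep only the remaining points.
mask = mad_outlier_mask(yield_filled)
x_clean = [t for t, is_out in zip(temperature, mask) if not is_out]
y_clean = [y for y, is_out in zip(yield_filled, mask) if not is_out]

# Step 3: fit a linear model on the cleaned data.
slope, intercept = fit_line(x_clean, y_clean)
print(f"flagged outliers: {sum(mask)}, slope: {slope:.3f} %/K")
```

The median is used for both imputation and outlier detection because, unlike the mean, it is hardly affected by the spike itself. With real process data the choice of imputation strategy and threshold would need to be justified case by case.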
The examination takes the form of a semester assignment followed by an oral discussion. In the semester assignment, students apply the methods learned to a real data set from process engineering.