Syllabus
Introduction
The course Machine Learning and Data Analytics in Industrial Production presents pre-processing methods, supervised and unsupervised classifiers, reinforcement learning, and data analytics methods. For the supervised and unsupervised classifiers, both conventional and deep learning techniques are covered; the latter include various types of deep Artificial Neural Networks (ANN), such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), which are suitable for Natural Language Processing (NLP). Explainable Artificial Intelligence (XAI) methods are also introduced to give students a more comprehensive and intuitive view of machine learning methods. All these techniques are illustrated in various situations and on various architectures, such as those specific to edge and fog computing devices, as well as in the Industrial Internet of Things (IIoT) context.
The master's students are expected to study and understand all the methods and algorithms carefully, to be able to choose the most appropriate ones for a given practical situation, and to develop original ones, the application domain being industrial production. During the laboratory sessions, the students will gain specific technical skills and improve their programming skills so that they can develop accurate and efficient solutions in Python. Quality of work, creativity, innovation, and time management are among the objectives of these classes.
The knowledge and skills acquired through this course will enable the master's students to bring innovation and efficiency to the industrial sector and to develop methods that support alignment with Industry 4.0, with a positive economic and social impact through enhanced reliability, efficiency, and productivity.
Total Hours
This course unit covers 130 hours, of which 28 hours are lectures, 28 hours are lab work, and 74 hours are individual study and work.
General Objective
The general objective of this course is to provide reliable knowledge of machine learning and data analytics, covering data pre-processing techniques (including dimensionality reduction), supervised and unsupervised classifiers (both conventional and deep learning approaches), reinforcement learning, and explainable artificial intelligence models. Recurrent Neural Networks (RNN) and Natural Language Processing (NLP) are also among the topics.
Specific Objectives / Learning Outcomes
The course covers supervised and unsupervised classification techniques, prediction, and NLP in industrial production, together with their experimentation on appropriate architectures and infrastructures, including those specific to edge and fog computing and to the Industrial Internet of Things. Its specific objectives are:
- Understand the basics of supervised and unsupervised classification techniques, predictions, and NLP in the context of industrial production.
- Understand the importance of appropriate architectures and infrastructures in the Industrial Internet of Things context.
- Be able to apply machine learning techniques and data analytics methods to improve industrial production systems.
- Be able to identify opportunities for automation and improvement in industrial production systems using the above-mentioned techniques.
- Be able to experiment with supervised and unsupervised classification techniques, predictions, and NLP on appropriate architectures and infrastructures in the Industrial Internet of Things context.
- Be able to evaluate the effectiveness of different machine learning techniques and data analytics methods for improving industrial production systems.
- Develop skills in data preparation, data cleaning, feature selection, and feature engineering for industrial production data.
- Develop skills in working with edge and fog computing infrastructures for processing and analyzing industrial production data.
- Develop skills in interpreting and visualizing the results of machine learning models and experiments in industrial production systems.
- Be able to communicate the results of machine learning experiments and their implications for industrial production systems to technical and non-technical audiences.
These learning outcomes aim to equip students with the knowledge, skills, and abilities needed to apply machine learning and data analytics methods to improve industrial production systems and to identify opportunities for automation. By the end of the course, students should be able to experiment with supervised and unsupervised classification techniques, prediction, and NLP on appropriate architectures and infrastructures in the Industrial Internet of Things context, and to evaluate the effectiveness of different machine learning techniques for industrial production systems. They should also be able to communicate their results and the implications to both technical and non-technical audiences.
Professional Competencies
The professional competencies that a student can gain from this course unit include:
- Understanding and assimilating the concepts and algorithms specific to machine learning and data analytics: data preprocessing, the design and implementation of supervised and unsupervised classifiers (using both conventional and deep learning techniques), reinforcement learning methods, and data analytics methods
- Acquiring the ability to develop one's own modules and methods in the fields of machine learning and data analytics
- Acquiring the ability to evaluate methods specific to machine learning and data analytics
- Being able to apply machine learning and data analytics methods on various architectures, such as those specific to edge and fog computing, and in the context of the Internet of Things
Cross Competencies
The course unit on Machine Learning and Data Analytics in Industrial Production develops the following cross-competencies in students:
- Acquiring the ability to apply the methods and algorithms specific to machine learning and data analytics in the field of industrial production
- The ability to develop and implement machine learning methods and data analytics techniques appropriate to the objectives of the problem, and to integrate these methods into industrial production systems.
Alignment to Social and Economic Expectations
Evaluation
Assessment methods
For the lecture part of the course on Machine Learning and Data Analytics in Industrial Production, the following assessment methods will be employed:
- Quizzes: In-class or online quizzes that test the students’ understanding of the key concepts and theories covered in this course.
- Written assignments: Individual assignments that require students to apply their knowledge and skills to a real-world problem or case study, or to study new concepts and algorithms in the domain on their own.
- Midterm and Final Written Examinations: These examinations will consist of multiple-choice, short-answer, and essay questions, as well as problems to be solved by devising or applying appropriate algorithms. Their purpose is to assess the students’ overall understanding of the course material.
For the laboratory part of the course, the following methods will be employed:
- Oral presentations: Students are required to present their laboratory work and their individual assignments to the class. The assessment is based on presentation skills, content, and interaction with the audience.
- Written or practical laboratory tests: Students will take at least one written or practical test at the end of the semester to demonstrate the knowledge and skills they have acquired.
Assessment criteria
Regarding the course lectures, the assessment criteria are the following:
- Knowledge and understanding: Assessment of the students’ ability to understand the concepts and theory specific to the discipline and to identify and apply the most appropriate solutions for a given context.
- Analytical problem solving: Evaluation of the students’ ability to understand complex problems, assess different solutions, and develop and implement the most appropriate methods.
- Communication skills: Assessment of the students’ ability to express their ideas and to explain the specific methods and algorithms clearly and concisely.
- Application of technology: Evaluation of the students’ capacity to apply the appropriate methods and algorithms in the industrial production context.
For the laboratory work, the assessment criteria are provided below:
- Technical Skills: Assessment of the student’s ability to apply the knowledge and skills acquired during the lecture and laboratory hours to develop appropriate methods and algorithms and to use them to solve specific problems in the domain of industrial production.
- Quality of Work: Assessment of the student’s ability to develop accurate and efficient methods in the domain of machine learning and data analytics with applications in industrial production.
- Creativity and Innovation: Evaluation of the student’s ability to think creatively and to develop original machine learning and data analytics methods with applications in industrial production.
- Time Management: Assessment of the student’s ability to manage their time effectively and deliver completed work within the specified timeframe.
Quantitative performance indicators to assess the minimum level of performance (mark 5 on a scale from 1 to 10)
Quantitative performance indicators to assess the minimum level of performance (mark 5 on a scale from 1 to 10) for the lectures:
- Attendance and participation in class discussions – The student should attend at least 60% of the lectures and actively participate in class discussions.
- Homework and Quizzes – The student should complete all homework assignments and quizzes with a minimum score of 70%.
- Midterm Exam – The student should achieve a minimum score of 50% on the midterm exam.
Quantitative performance indicators to assess the minimum level of performance (mark 5 on a scale from 1 to 10) for the laboratory works:
- Laboratory attendance and participation – The student should attend and participate in all scheduled laboratory sessions.
- Laboratory reports – The student should submit all laboratory reports on time, with a minimum score of 60% on each report.
- Lab assignments – The student should complete all laboratory assignments with a minimum score of 70%.
- Laboratory tests – The student should achieve a minimum score of 50% on each laboratory test.
Quantitative performance indicators for the final exam to assess the minimum level of performance:
- Correctly answering a minimum number of lecture-related questions – 70% of the total questions.
- The student should be able to demonstrate an understanding of the basic concepts and theories related to machine learning and data analytics and their applications in industrial production, with a minimum score of 50% on multiple-choice questions or short answer questions.
- The student should be able to explain and analyze real-life case studies and their results, with a minimum score of 50%.
- The student should be able to demonstrate a basic knowledge of the techniques, algorithms, technologies, tools, and methodologies used in the domain of machine learning and data analytics, with a minimum score of 50%.
- The student should be able to apply the concepts and theories learned in the lectures to solve practical problems, with a minimum score of 50% on problem-solving questions.
- The student should be able to critically evaluate machine learning and data analytics techniques and their suitability for industrial production, with a minimum score of 50% on essay questions.
- Display of critical thinking skills, as evidenced by the number of correct answers to questions requiring analysis and synthesis of information.
- Overall exam performance, measured in terms of the total number of correct answers and expressed as a percentage of the total exam score. A minimum score of 50% or above is set as the benchmark for a mark of 5.
Lectures
Unit 1. Introduction to Machine Learning and Data Analytics in Industrial Production (2 hours)
- Definition of machine learning and data analytics in the context of industrial production
- Types of machine learning: supervised, unsupervised, reinforcement learning
- Applications of machine learning in industrial production
- Data analytics methods and tools
Unit 2. Data Preprocessing and Feature Engineering (2 hours)
- Overview of data preprocessing and feature engineering
- Data cleaning techniques: removal of duplicates, treatment of missing values, handling outliers
- Feature selection and dimensionality reduction techniques: correlation analysis, information gain, Principal Component Analysis (PCA)
- Feature engineering techniques: feature scaling, discretization, normalization
Unit 3. Unsupervised Learning for Industrial Production (2 hours)
- Introduction to unsupervised learning in industrial production
- Clustering techniques: k-means, X-means, Expectation Maximization, hierarchical clustering
- Evaluation of clustering results: silhouette score, elbow method
Unit 4. Supervised Learning for Industrial Production (Part 1) (2 hours)
- Introduction to supervised learning in industrial production
- Conventional supervised classifiers: k-nearest-neighbor (k-NN), decision trees, Support Vector Machines (SVM), Multilayer Perceptron (MLP), regression
- Evaluation metrics: accuracy, precision, recall, F1-score
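As a brief illustration of the evaluation metrics listed above, the following minimal Python sketch computes accuracy, precision, recall, and F1-score from true and predicted labels. The function name `binary_metrics` and the sample labels (1 = defective part, 0 = good part) are assumptions made for this example and are not part of the course material.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1-score for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical inspection results: 1 = defective part, 0 = good part.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this example one defective part is missed and one good part is flagged, so precision and recall differ from accuracy; on imbalanced production data this difference is usually the reason for reporting all four values.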
Unit 5. Supervised Learning for Industrial Production (Part 2) (2 hours)
- Introduction to ensemble methods in industrial production
- AdaBoost, bagging, boosting, stacking
- Multiclass classification: one-vs-rest, one-vs-one
Unit 6. Deep Learning for Industrial Production (Part 1) (2 hours)
- Introduction to deep learning in industrial production
- Convolutional Neural Networks (CNN): architecture, training, and application in image recognition
- Stacked Denoising Autoencoders (SDAE): architecture, training, and application in feature learning
Unit 7. Deep Learning for Industrial Production (Part 2) (2 hours)
- Introduction to Recurrent Neural Networks (RNN) with applications in Natural Language Processing (NLP)
- Long Short-Term Memory (LSTM) networks, Gated Recurrent Units (GRU)
- NLP tasks: sentiment analysis, named entity recognition
Unit 8. Model Evaluation and Selection (2 hours)
- Overview of model evaluation and selection in industrial production
- Performance metrics: accuracy, true-positive rate (sensitivity), true-negative rate (specificity), the area under ROC curve (AUC)
- Cross-validation techniques: k-fold, leave-one-out
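The k-fold and leave-one-out schemes mentioned above can be sketched as an index-splitting routine. This is a minimal illustration only: the helper name `k_fold_indices` is an assumption, the folds are contiguous and unshuffled, and in practice the data would typically be shuffled (or stratified) before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# 3-fold split of 10 samples.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)

# Leave-one-out is the special case k == n_samples.
loo = list(k_fold_indices(5, 5))
```

Each sample appears in exactly one test fold, so every observation is used once for validation and k - 1 times for training.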
Unit 9. Reinforcement Learning for Industrial Production (2 hours)
- Introduction to reinforcement learning in industrial production
- Markov Decision Processes (MDP)
- Q-learning, SARSA, Deep Q-Networks (DQN)
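To make the Q-learning update concrete, the sketch below trains a tabular agent on a toy problem. Everything about the environment is an assumption made for illustration: a hypothetical five-state corridor in which the agent starts in state 0 and receives a reward of 1.0 on reaching state 4. The hyperparameters are likewise arbitrary example values. SARSA and DQN are not shown.

```python
import random

N_STATES = 5       # hypothetical corridor: states 0..4, state 4 is the goal
ACTIONS = (0, 1)   # 0 = move left, 1 = move right


def step(state, action):
    """Deterministic toy environment: reward 1.0 only when the goal is reached."""
    next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0, max_steps=200):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Off-policy target: reward plus discounted greedy value of the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = q_learning()
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

After training, the greedy policy moves right in every non-terminal state, and the learned values decay by the discount factor with each step away from the goal.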
Unit 10. Explainable AI for Industrial Production (2 hours)
- Introduction to Explainable AI (XAI) in industrial production
- Techniques for explaining machine learning models: feature importance, partial dependence plots, SHAP values
- Case studies of XAI in industrial production
Unit 11. Real-time Data Analytics for Industrial Production (2 hours)
- Introduction to real-time data analytics in industrial production
- Technologies for real-time data processing: Apache Kafka, Apache Storm, Apache Flink
- Applications of real-time data analytics in industrial production: predictive maintenance, anomaly detection
Unit 12. Edge and Fog Computing for Industrial Production (2 hours)
- Introduction to edge and fog computing in the context of industrial production
- Differences between edge and fog computing
- Applications of edge and fog computing in industrial production: real-time data analytics, predictive maintenance, autonomous robots
Unit 13. Industrial Internet of Things (IIoT) and Predictive Maintenance (2 hours)
- Introduction to IIoT in industrial production
- Applications of IIoT in industrial production: asset tracking, predictive maintenance
- Overview of predictive maintenance in industrial production
- Predictive maintenance techniques: condition monitoring, failure prediction, anomaly detection
- Case studies of predictive maintenance in industrial production using machine learning and data analytics
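As a simple illustration of condition monitoring with anomaly detection, the sketch below flags sensor readings that deviate from a baseline estimated on healthy data. The three-standard-deviation threshold, the function names, and the vibration values are all assumptions chosen for this example; real predictive maintenance pipelines generally use richer features and models.

```python
from statistics import mean, stdev


def fit_baseline(healthy_readings):
    """Estimate the mean and sample standard deviation of normal operation."""
    return mean(healthy_readings), stdev(healthy_readings)


def detect_anomalies(readings, baseline, threshold=3.0):
    """Return the indices of readings more than `threshold` std devs from the mean."""
    mu, sigma = baseline
    return [i for i, x in enumerate(readings) if abs(x - mu) > threshold * sigma]


# Hypothetical vibration amplitudes recorded while the machine was known to be healthy.
healthy = [0.50, 0.52, 0.49, 0.51, 0.48, 0.50, 0.53, 0.47]
baseline = fit_baseline(healthy)

# Hypothetical new readings from the same sensor.
new_readings = [0.51, 0.49, 0.55, 0.72, 0.50, 0.95]
anomalies = detect_anomalies(new_readings, baseline)
print(anomalies)
```

The baseline is fitted only on data from normal operation, so a developing fault does not inflate the estimated spread and mask itself.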
Unit 14. Future Trends in Industrial Production and AI (2 hours)
- Emerging technologies for industrial production and AI: blockchain, edge AI, neuromorphic computing
- Future trends in industrial production and AI: explainable AI, autonomous systems, human-robot collaboration
- Ethical and social implications of AI in industrial production
Lab Work
Unit 1. Setting up the development environment with Python and its libraries (NumPy, Pandas, Scikit-learn, TensorFlow, Keras, PyTorch) (2 hours classwork and 2 hours individual work)
Objective: Students will be able to set up and configure the development environment with Python and machine learning/deep learning libraries.
- Introduction to Python and its scientific computing libraries
- Installing and setting up the development environment
- Introduction to machine learning and deep learning libraries
Unit 2. Preprocessing and cleaning of real-world industrial data. Feature engineering for industrial data. Dimensionality reduction (2 hours classwork and 2 hours individual work)
Objective: Students will be able to preprocess and clean real-world industrial data, engineer appropriate features, and reduce dimensions to improve the quality of data for machine learning models.
- Data cleaning and preprocessing techniques
- Feature engineering techniques
- Dimensionality reduction techniques
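A minimal sketch of two of the preprocessing steps in this unit, missing-value treatment and feature scaling, is given below in plain Python. The function names and the temperature readings are assumptions made for illustration; in the laboratory, the equivalent Pandas and Scikit-learn utilities would more likely be used, and dimensionality reduction is not shown here.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical temperature readings (degrees Celsius) with one missing sample.
temperatures = [20.0, 22.0, None, 24.0, 26.0]
filled = impute_mean(temperatures)
scaled = min_max_scale(filled)
zscores = standardize(filled)
print(filled, scaled, zscores)
```

Min-max scaling preserves the shape of the distribution within a fixed range, while standardization is less sensitive to the exact minimum and maximum; which one is preferable depends on the model that follows.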
Unit 3. Clustering of industrial data (2 hours classwork and 2 hours individual work)
Objective: Students will be able to apply clustering techniques on industrial data and evaluate the results using appropriate metrics.
- Introduction to clustering techniques for industrial data
- K-means, hierarchical clustering, DBSCAN
- Clustering evaluation metrics
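The k-means procedure named in this unit alternates an assignment step and a centroid update step. The sketch below shows that loop in plain Python under stated assumptions: the function name `kmeans`, the explicit initial centroids, and the two-dimensional sensor points are hypothetical. A library implementation such as Scikit-learn's `KMeans` would normally be used in practice; hierarchical clustering and DBSCAN are not shown.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point is assigned to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical (vibration, temperature) readings in normalized units.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels, centroids)
```

The result depends on the initial centroids, which is why library implementations usually run several initializations and keep the best one.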
Unit 4. Regression for industrial data. Classification for industrial data (2 hours classwork and 2 hours individual work)
Objective: Students will be able to apply regression and classification techniques on industrial data and evaluate the performance of the models.
- Introduction to regression and classification for industrial data
- Linear regression, logistic regression, support vector machines, decision trees, random forests
Unit 5. Ensemble methods for industrial data. Boosting algorithms for industrial data (2 hours classwork and 2 hours individual work)
Objective: Students will be able to apply ensemble methods and boosting algorithms on industrial data to improve the performance of the models.
- Introduction to ensemble methods for industrial data
- Bagging, boosting, stacking
- AdaBoost, Gradient Boosting, XGBoost
Unit 6. Building a CNN model for industrial image analysis (2 hours classwork and 2 hours individual work)
Objective: Students will be able to build and train a CNN model for industrial image analysis and apply transfer learning to improve the performance of the model.
- Introduction to CNNs and their applications in industrial image analysis
- CNN architecture and training
- Transfer learning
Unit 7. Building RNN and NLP models for industrial time series analysis (2 hours classwork and 2 hours individual work)
Objective: Students will be able to build and train RNN and NLP models for industrial time series analysis and extract insights from textual data.
- Introduction to RNNs and their applications in industrial time series analysis
- LSTM, GRU, Bidirectional RNNs
- Introduction to NLP and its applications in industrial production
Unit 8. Evaluation of machine learning models for industrial data (2 hours classwork and 2 hours individual work)
Objective: Students will be able to evaluate the performance of machine learning models using appropriate metrics and avoid overfitting and underfitting.
- Introduction to evaluation metrics for machine learning models
- Cross-validation, confusion matrix, ROC curve, precision-recall curve
- Overfitting and underfitting
Unit 9. Building a reinforcement learning model for an industrial control problem (2 hours classwork and 2 hours individual work)
Objective: Students will be able to build and train a reinforcement learning model for an industrial control problem and apply deep reinforcement learning techniques to improve the performance of the model.
- Introduction to reinforcement learning and its applications in industrial control
- Markov decision processes, Q-learning, SARSA
- Deep reinforcement learning
Unit 10. Building an explainable machine learning model for an industrial production problem (2 hours classwork and 2 hours individual work)
Objective: Students will be able to build an explainable machine learning model for an industrial production problem and interpret the model’s predictions.
- Introduction to explainable AI and its importance in industrial production
- Techniques for building explainable machine learning models
- LIME, SHAP values, feature importance, decision trees
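One model-agnostic way to obtain a feature importance score is permutation importance: shuffle one feature column and measure how much the model's accuracy drops. The sketch below illustrates this idea only. The toy model, its 0.5 threshold, the feature names, and the data are assumptions made for the example, and LIME and SHAP are not shown.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fault classifier that looks only at the vibration feature."""
    vibration, _ambient_temperature = row
    return 1 if vibration > 0.5 else 0


rows = [(0.10, 20.0), (0.20, 25.0), (0.30, 22.0), (0.40, 21.0), (0.45, 24.0),
        (0.60, 23.0), (0.70, 20.5), (0.80, 24.5), (0.90, 21.5), (0.95, 22.5)]
labels = [toy_model(r) for r in rows]
importances = permutation_importance(toy_model, rows, labels)
print(importances)
```

Because the toy model ignores the second feature, shuffling it leaves accuracy unchanged and its importance is zero, whereas shuffling the vibration column degrades the predictions.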
Unit 11. Building a real-time data analytics pipeline for industrial production using Apache Kafka, Spark Streaming, and MLlib. Deploying the pipeline on a cloud-based platform (2 hours classwork and 2 hours individual work)
Objective: Students will be able to build a real-time data analytics pipeline using Apache Kafka, Spark Streaming, and MLlib, and deploy the pipeline on a cloud-based platform.
- Introduction to real-time data analytics in industrial production
- Apache Kafka and its use in streaming data processing
- Spark Streaming and MLlib for real-time data analytics and machine learning
- Deployment of the pipeline on a cloud-based platform (e.g., AWS, GCP, Azure)
Unit 12. Deploying a machine learning model on edge devices for industrial production (2 hours classwork and 2 hours individual work)
Objective: Students will be able to deploy a machine learning model on edge devices for industrial production and address the challenges and considerations involved in edge deployment.
- Introduction to edge computing and its use in industrial production
- Challenges and considerations for deploying machine learning models on edge devices
- Techniques for deploying machine learning models on edge devices
Unit 13. Collecting and processing sensor data from IIoT devices for Predictive Maintenance. Building a Predictive Maintenance model using Scikit-learn or other ML libraries. Evaluating the model’s performance (2 hours classwork and 2 hours individual work)
Objective: Students will be able to collect and process sensor data from IIoT devices, build a Predictive Maintenance model using Scikit-learn or other ML libraries, and evaluate the model’s performance.
- Introduction to predictive maintenance and its importance in industrial production
- Collecting and processing sensor data from IIoT devices
- Building a Predictive Maintenance model using Scikit-learn or other ML libraries
- Evaluation metrics for Predictive Maintenance models
Unit 14. Exploring emerging technologies for industrial production and AI through a case study or a demo (e.g., blockchain-based supply chain management, edge-based anomaly detection, quantum-inspired optimization) (2 hours classwork and 2 hours individual work)
Objective: Students will be able to understand and evaluate the potential benefits and challenges of using emerging technologies in industrial production and AI through case studies or demos.
- Introduction to emerging technologies for industrial production and AI
- Case studies or demos of emerging technologies (e.g., blockchain-based supply chain management, edge-based anomaly detection, quantum-inspired optimization)
- Evaluation of the potential benefits and challenges of using emerging technologies in industrial production
Supporting Infrastructure
To carry out the activities of this course unit, students can work in our labs with the following equipment:
- Computers with a 2.60 GHz Intel Core i7 processor, 8 or 16 GB of RAM, and an NVIDIA GeForce GTX 1650 Ti GPU.