Problem statement
The US Immigration and Nationality Act (INA) permits foreign workers to come to the United States to work on either a temporary or a permanent basis. The act also protects US workers against adverse impacts in the workplace and sets requirements that employers must meet when they hire foreign workers to fill workforce shortages. The immigration programs are administered by the Office of Foreign Labor Certification (OFLC).
OFLC processes job certification applications from employers seeking to bring foreign workers into the United States and grants certifications.
Because the number of applicants last year was very large, OFLC needs machine learning models to shortlist visa applicants based on historical data.
In this project we use the given data to build a classification model:
The model predicts whether a visa will be approved or not, based on the given dataset.
It can be used to recommend a suitable profile for applicants whose visa should be certified or denied, based on the criteria that influence the decision.
Life cycle of a machine learning project
Understanding the Problem Statement
Data Collection
Exploratory data analysis
Data Cleaning
Data Pre-Processing
Model Training
Choose the best model
Models that were tried out:
Random Forest
Decision Tree
Gradient Boosting
Logistic Regression
K-Neighbors Classifier
XGBClassifier
CatBoost Classifier
Support Vector Classifier
AdaBoost Classifier
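As a rough illustration of how several of the models above could be compared, here is a minimal sketch using scikit-learn. It is not the project's actual training code: the data is a synthetic stand-in generated with `make_classification`, the F1 metric and the hyperparameters are assumptions, and XGBoost and CatBoost are left out only to keep the example free of extra dependencies.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data (assumption: stands in for the visa dataset).
X, y = make_classification(n_samples=300, n_features=10, random_state=42)

# Candidate models with illustrative, untuned settings.
candidates = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=42),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Neighbors Classifier": KNeighborsClassifier(),
    "Support Vector Classifier": SVC(),
    "AdaBoost Classifier": AdaBoostClassifier(random_state=42),
}


def compare_models(models, X, y, cv=3):
    """Return the mean cross-validated F1 score for each candidate model."""
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="f1").mean()
        for name, model in models.items()
    }


scores = compare_models(candidates, X, y)
best_name = max(scores, key=scores.get)

# Print the models from best to worst mean score.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
print("Best model:", best_name)
```

On real data the ranking depends heavily on preprocessing and hyperparameter tuning, so a comparison like this is only a starting point for choosing the best model.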
Components of the training pipeline
Data Ingestion
Data Validation
Data Transformation
Model Trainer
Model Evaluation
Model Pusher
Deployment
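The components above run one after another, each consuming the artifact produced by the previous one. The sketch below shows that chaining idea only. The `run_pipeline` helper, the step functions, the toy rows and the 0.5 acceptance threshold are all hypothetical placeholders, not the project's real classes or data, and deployment is left out.

```python
from typing import Any, Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Any], Any]]


def run_pipeline(steps: List[Step], initial: Any = None) -> Dict[str, Any]:
    """Run each step in order, feeding it the previous step's artifact."""
    artifacts: Dict[str, Any] = {}
    current = initial
    for name, step in steps:
        current = step(current)
        artifacts[name] = current
    return artifacts


# Hypothetical placeholder steps operating on toy rows.
def data_ingestion(_: Any) -> list:
    return [
        {"feature": 1.0, "label": 1},
        {"feature": None, "label": 0},
        {"feature": 3.0, "label": 1},
    ]


def data_validation(rows: list) -> list:
    # Keep only rows without missing values.
    return [r for r in rows if all(v is not None for v in r.values())]


def data_transformation(rows: list) -> tuple:
    # Split rows into a feature matrix and a label vector.
    return [[r["feature"]] for r in rows], [r["label"] for r in rows]


def model_trainer(data: tuple) -> dict:
    # Toy "model": always predict the majority label.
    _, y = data
    majority = max(set(y), key=y.count)
    return {"prediction": majority, "train_accuracy": y.count(majority) / len(y)}


def model_evaluation(model: dict) -> dict:
    # Accept the model if it clears an assumed accuracy threshold.
    return {**model, "accepted": model["train_accuracy"] >= 0.5}


def model_pusher(report: dict) -> str:
    return "pushed" if report["accepted"] else "rejected"


steps: List[Step] = [
    ("Data Ingestion", data_ingestion),
    ("Data Validation", data_validation),
    ("Data Transformation", data_transformation),
    ("Model Trainer", model_trainer),
    ("Model Evaluation", model_evaluation),
    ("Model Pusher", model_pusher),
]

artifacts = run_pipeline(steps)
print(artifacts["Model Pusher"])
```

Keeping every intermediate artifact in a dictionary makes it easy to inspect what each stage produced when a run goes wrong.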
1) Problem Statement, EDA, Hyperparameter tuning and Model training
Watch the YouTube video on my channel explaining this part of the project
2) MongoDB & Folder Structure of the project
Watch the YouTube video on my channel explaining this part of the project
3) Custom logging & Exception
Watch the YouTube video on my channel explaining this part of the project
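One common way to combine custom logging with a custom exception is sketched below, using only the Python standard library. The logger name `visa_project`, the `PipelineException` class and the `safe_divide` helper are hypothetical names chosen for illustration. The project's own implementation may look different.

```python
import logging

# Log format and logger name are assumptions for this sketch.
logging.basicConfig(
    level=logging.INFO,
    format="[%(asctime)s] %(name)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger("visa_project")


class PipelineException(Exception):
    """Wrap an error and record the file and line where it was raised."""

    def __init__(self, error: Exception):
        tb = error.__traceback__
        # Walk to the innermost frame, where the error actually occurred.
        while tb is not None and tb.tb_next is not None:
            tb = tb.tb_next
        if tb is not None:
            location = f"{tb.tb_frame.f_code.co_filename}:{tb.tb_lineno}"
        else:
            location = "unknown location"
        super().__init__(f"{type(error).__name__} at {location}: {error}")


def safe_divide(a: float, b: float) -> float:
    """Divide a by b, logging and re-raising failures as PipelineException."""
    try:
        return a / b
    except Exception as exc:
        logger.error("Division failed: %s", exc)
        raise PipelineException(exc) from exc


try:
    safe_divide(1, 0)
except PipelineException as exc:
    message = str(exc)
    print(message)
```

Wrapping errors this way puts the original error type, its location and its message into a single line, which is convenient when reading logs from a deployed pipeline.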
4) Data Ingestion Flowchart
Watch the YouTube video on my channel explaining this part of the project
5) Data Validation Flowchart
Watch the YouTube video on my channel explaining this part of the project
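Data validation typically checks incoming records against an expected schema before they reach the transformation step. The sketch below shows one simple way to do that with plain Python. The `SCHEMA` dictionary, its column names and the `validate_rows` function are hypothetical and are not taken from the project's actual dataset or code.

```python
from typing import Any, Dict, List

# Hypothetical schema: column name -> expected Python type.
SCHEMA: Dict[str, type] = {"applicant_id": str, "wage": float, "status": str}


def validate_rows(rows: List[Dict[str, Any]], schema: Dict[str, type]) -> List[str]:
    """Return a list of human-readable schema violations (empty if valid)."""
    errors: List[str] = []
    for i, row in enumerate(rows):
        # Report columns that the schema requires but the row lacks.
        for col in sorted(set(schema) - set(row)):
            errors.append(f"row {i}: missing column '{col}'")
        # Report present columns whose value has the wrong type.
        for col, expected in schema.items():
            if col in row and not isinstance(row[col], expected):
                errors.append(
                    f"row {i}: column '{col}' expected {expected.__name__}, "
                    f"got {type(row[col]).__name__}"
                )
    return errors


rows = [
    {"applicant_id": "A1", "wage": 52000.0, "status": "Certified"},
    {"applicant_id": "A2", "wage": "high"},
]
errors = validate_rows(rows, SCHEMA)
for err in errors:
    print(err)
```

Collecting all violations instead of stopping at the first one gives a complete report per batch, which can then decide whether the pipeline continues.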
6) Data Transformation Flowchart
Watch the YouTube video on my channel explaining this part of the project
7) Model Trainer Flowchart
Watch the YouTube video on my channel explaining this part of the project
8) AWS storage & Model Evaluation Flowchart
Watch the YouTube video on my channel explaining this part of the project
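A model evaluation step often compares a newly trained model with the one already held in storage and accepts the new one only if it is clearly better. The sketch below illustrates that decision rule under stated assumptions: the `EvaluationResult` class, the `evaluate_candidate` function and the 0.02 minimum improvement are hypothetical, and fetching the stored model's score from AWS is not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvaluationResult:
    is_accepted: bool
    improvement: float


def evaluate_candidate(
    new_score: float,
    production_score: Optional[float],
    min_improvement: float = 0.02,
) -> EvaluationResult:
    """Accept the new model only if it beats the stored one by a margin.

    When no production model exists yet (production_score is None),
    the baseline score is treated as 0.0.
    """
    baseline = production_score if production_score is not None else 0.0
    improvement = new_score - baseline
    return EvaluationResult(
        is_accepted=improvement > min_improvement,
        improvement=improvement,
    )


better = evaluate_candidate(new_score=0.90, production_score=0.85)
marginal = evaluate_candidate(new_score=0.86, production_score=0.85)
first_model = evaluate_candidate(new_score=0.50, production_score=None)
print(better.is_accepted, marginal.is_accepted, first_model.is_accepted)
```

Requiring a minimum improvement avoids replacing the stored model because of small score differences that may just be noise.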
9) Model Pusher Flowchart
Watch the YouTube video on my channel explaining this part of the project
10) Training & Prediction Pipelines
Watch the YouTube video on my channel explaining this part of the project
11) App Design
Watch the YouTube video on my channel explaining this part of the project
12) Docker
Watch the YouTube video on my channel explaining this part of the project
13) AWS EC2 & Self Hosted Runner
Watch the YouTube video on my channel explaining this part of the project
14) CICD pipeline & Deployment
Watch the YouTube video on my channel explaining this part of the project