Master thesis project in: Industry classification using machine learning models - Intern

ING | Amsterdam | NL

...

Summary

Research and develop a machine learning model that encodes the industry that a client firm is working in.

Business context

Industry, or sector, is one of the most prominent features of firms. It is used to group them for business analytical purposes (reports, dashboards) and as an input to (machine learning) models for predictions. Industry is also among the better-known features of our clients; we have so-called NAICS codes (North American Industry Classification System) assigned to close to 100% of our clients. However, for firms that do not bank with ING we do not have an industry classification.

Project context

We would like to have a model that classifies the industries that a firm operates in.

Who a firm's buyers and suppliers are is largely dictated by the industry that the firm operates in. A hotel is unlikely to receive large sums from a bakery. At ING we see our clients' buyers and suppliers, but we only know the buyers' and suppliers' industries if they in turn are ING clients. We thus have only partial information on these business partners, typically more complete for the smaller businesses, at least within NL.

The model should consider the industries of the buyers and suppliers, and potentially how much is paid to or received from them, and infer the industry of the firm. It can be trained on our own clients, for which we know the industry, and then applied to external firms to estimate their industries.

Where classical machine learning problems have a fixed set of features as input, this case (initially) does not: every firm has a different number of buyers and suppliers. Furthermore, there is no order in these; one supplier does not “go before” another. The model needs to be able to deal with this. An easy solution is to embed the buyer and supplier industries into a TF-IDF type vector, but other solutions may be out there.
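As a rough illustration of the TF-IDF idea mentioned above, the sketch below turns each firm's unordered, variable-length list of counterparty industries into a fixed-length vector. The function names (`fit_idf`, `tfidf_vector`), the smoothing formula and the toy data are assumptions made for this example, not part of the project description.

```python
import math
from collections import Counter


def fit_idf(firms):
    """Compute smoothed inverse document frequencies, treating each firm as one 'document'."""
    n_firms = len(firms)
    doc_freq = Counter()
    for counterparties in firms:
        # Count each industry at most once per firm.
        doc_freq.update(set(counterparties))
    return {
        code: math.log((1 + n_firms) / (1 + df)) + 1.0
        for code, df in doc_freq.items()
    }


def tfidf_vector(counterparties, idf, vocabulary):
    """Map an unordered list of industry codes to a vector with one entry per vocabulary code."""
    counts = Counter(counterparties)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    # Term frequency is the share of counterparties in each industry.
    return [counts[code] / total * idf[code] for code in vocabulary]


# Hypothetical toy data: two-digit industry prefixes of each firm's known counterparties.
firms = [
    ["72", "44", "44", "11"],
    ["11", "11", "31"],
    ["44", "72"],
]
idf = fit_idf(firms)
vocabulary = sorted(idf)
vectors = [tfidf_vector(f, idf, vocabulary) for f in firms]
```

Every firm ends up with a vector of the same length, and reordering the counterparties does not change it. One possible variation, again an assumption, is to replace the counterparty counts by the share of money paid to or received from each industry.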

There are degrees of being wrong. If a firm is a pig farm, but the model classifies it as a sheep farm, then the model is less far off than when it classifies it as an electric power generation company. NAICS codes luckily contain a hierarchy that can be used to assess how far off the model is: the first two digits give the sector, the third digit the subsector, the fourth the industry group, etc. up to six digits.
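One way to make "how far off" concrete is to count the hierarchy levels on which two codes disagree. The sketch below is a minimal example under that assumption; the function names are hypothetical, and the example codes (112210 for hog and pig farming, 112410 for sheep farming, 221112 for fossil fuel electric power generation) are used for illustration only.

```python
# Prefix lengths of the NAICS levels described above: sector, subsector,
# industry group, and the two finer levels up to six digits.
NAICS_LEVELS = (2, 3, 4, 5, 6)


def shared_depth(true_code, predicted_code):
    """Number of hierarchy levels (0 to 5) on which two six-digit codes agree."""
    depth = 0
    for n_digits in NAICS_LEVELS:
        if true_code[:n_digits] != predicted_code[:n_digits]:
            break
        depth += 1
    return depth


def hierarchical_distance(true_code, predicted_code):
    """0 for an exact match, 5 when even the two-digit prefix differs."""
    return len(NAICS_LEVELS) - shared_depth(true_code, predicted_code)


pig_farm, sheep_farm, power_plant = "112210", "112410", "221112"
print(hierarchical_distance(pig_farm, sheep_farm))   # 3: same subsector, different industry group
print(hierarchical_distance(pig_farm, power_plant))  # 5: different sector
```

Note that this sketch compares two-digit prefixes literally. Some NAICS sectors span several two-digit codes (for example 31-33 for manufacturing), so a real implementation may want to map those prefixes to a common sector first.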

Research tasks

  • Perform literature research into methods that can handle variable-length and unordered inputs
  • Choose a number of models that are suitable for the task, the input, the data size, etc. These should include, but don't have to be limited to, well-known methods such as logistic regression, boosted decision trees, neural networks, k-means or geometric models (e.g. node2vec)
  • Construct one or more learning objectives that incorporate degrees of wrongness
  • Construct a number of metrics that measure performance at different levels of wrongness
  • Create a pipeline with which you can train and test different combinations of models and objectives
  • Error analysis: Search for specific characteristics for which a model gets the industry wrong more severely or more often, e.g. when there are only a few buyers or suppliers, for certain buyers or suppliers, for little money spent or received, etc. This important step can help you improve the model, and also helps us understand how reliable the predicted industry is for different characteristics.
  • Compare models and objectives, perform error analysis, iterate and improve
  • Apply the best competing model(s) to our own clients to spot firms that may have been wrongly labeled, or that may operate in multiple industries
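The metric and error-analysis tasks above could start from something like the following sketch: accuracy measured at each NAICS prefix length, then broken down by a firm characteristic. All names, the record layout, the threshold of five counterparties and the toy predictions are assumptions for illustration.

```python
from collections import defaultdict

NAICS_LEVELS = (2, 3, 4, 5, 6)  # prefix lengths, from sector down to the full code


def accuracy_at_levels(true_codes, predicted_codes):
    """Per level, the fraction of firms whose predicted prefix matches the true prefix."""
    n = len(true_codes)
    return {
        d: sum(t[:d] == p[:d] for t, p in zip(true_codes, predicted_codes)) / n
        for d in NAICS_LEVELS
    }


def accuracy_by_slice(records, slice_key):
    """Group records by slice_key(record) and compute per-level accuracy within each group."""
    groups = defaultdict(lambda: ([], []))
    for rec in records:
        true_list, pred_list = groups[slice_key(rec)]
        true_list.append(rec["true"])
        pred_list.append(rec["pred"])
    return {key: accuracy_at_levels(t, p) for key, (t, p) in groups.items()}


def counterparty_bucket(rec):
    """Hypothetical slicing rule: firms with fewer than five known counterparties."""
    return "few" if rec["n_counterparties"] < 5 else "many"


# Hypothetical predictions for four firms.
records = [
    {"true": "112210", "pred": "112210", "n_counterparties": 12},
    {"true": "112210", "pred": "112410", "n_counterparties": 3},
    {"true": "721110", "pred": "221112", "n_counterparties": 2},
    {"true": "311811", "pred": "311812", "n_counterparties": 8},
]
overall = accuracy_at_levels(
    [r["true"] for r in records], [r["pred"] for r in records]
)
by_bucket = accuracy_by_slice(records, counterparty_bucket)
```

In this toy data the model is right at sector level for three of the four firms but exactly right for only one, and the firms with few counterparties do worse at every level. The same slicing function can be swapped for other characteristics, such as total money spent or received.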

Research goals

  • Compare various models and objectives for the task
  • Discuss the strengths (where they perform well) and weaknesses (where they perform badly) of the best competing model(s)
  • Document all methods, models, tests and results in reproducible documentation
  • Suggest clients that may have been wrongly labeled, or that may operate in multiple industries

Information :

  • Company : ING
  • Position : Master thesis project in: Industry classification using machine learning models - Intern
  • Location : Amsterdam
  • Country : NL

How to Submit an Application:

If you meet the requirements described above and are interested in this position, prepare your application documents (an application letter, CV, diploma and transcripts, and any other supporting material) and submit them through the application link.


Attention - In the recruitment process, legitimate companies never charge candidates fees. If a company asks for fees for interviews, tests, ticket reservations, etc., it is best to avoid it, as this indicates possible fraud. If you see something suspicious, please contact us: support@jobkos.com

Post Date : 2025-01-17