About OctopusCon: DS Talk
On November 12, after a one-year break, Quantum held an Octopus Conference dedicated to Data Science. The speakers were from Kharkiv IT companies Quantum and GlobalLogic. Michael Yushchuk, Head of Data Science at Quantum, moderated the event.
«For me, as the leader of the Data Science team, it is important that the guys regularly improve their presentation skills. I enjoyed participating in the review of my team’s reports and organizing the conference,» Michael comments.
OctopusCon was held in a mixed format: we managed to gather 30 participants offline, and the rest joined us via the live broadcast. Please check the video from OctopusCon on our YouTube channel.
At the conference, we tried to present something new for everyone from the Data Science community: those who just want to learn about new tools, hardcore researchers, and fans of solving practical puzzles.
It is important to note that the speakers talked about the tools and solutions they use in their own projects. According to our speakers, preparing a report helps them structure information about a project and notice its gaps and growth areas.
At OctopusCon we spoke about the following topics:
Andrey Nesmyanovich, ML Engineer at Quantum, talked about Kubeflow and whether it is worth using in day-to-day work.
Andrey described the kinds of projects where the Kubeflow platform fits:
— Kubeflow is a fit for enterprise projects.
— Kubeflow suits a relatively large project with several pipelines and a clear separation of tasks and the resources allocated to them.
— The project should have an ML focus: it can involve training a model or running its inference.
The participants also learned about the main advantages of Kubeflow compared to other workflow management platforms.
Vlad Khramtsov, Data Science Engineer at Quantum, talked about testing web applications using Imitation Learning.
In his talk, Vlad demonstrated how machine learning can run automated tests of web applications using only screenshots. The approach is built on imitation learning.
The participants learned that:
— Web testing relies heavily on non-automated methods.
— After collecting the necessary data, you can train a simple CNN model to reproduce tests on a website.
— It is possible to teach an ML agent to perform >90% of actions on websites using screenshots alone!
«Introduction to Learning to Rank» by Klim Yamkovoy, Data Science Engineer at GlobalLogic.
Klim’s talk was dedicated to Learning to Rank and Semi-Supervised learning. Klim spoke about Learning to Rank approaches and key papers in this area:
— Semi-supervised learning should be used when we have both labeled and unlabeled data. Usually, there is far more of the latter.
— This is a valuable tool when labeled data is scarce, as it can increase the model’s accuracy without requiring additional labeling.
— Due to its limitations, the tool cannot completely replace data labeling.
— Learning to Rank is used in many applied areas, but it is not as popular in the community as other areas of ML. Its main distinguishing features are its target metrics and datasets.
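To make the semi-supervised idea above concrete, here is a minimal sketch of pseudo-labeling (self-training), one common semi-supervised technique. The talk summary does not say which method was presented, so this is an illustration only: the nearest-centroid classifier, the 1-D features, and the names `self_train`, `min_margin`, and `rounds` are all assumptions made for the example.

```python
from statistics import mean


def centroids(points, labels):
    """Compute the mean feature value per class for 1-D points."""
    by_class = {}
    for x, y in zip(points, labels):
        by_class.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_class.items()}


def predict(cents, x):
    """Return (label, margin): the nearest class and how much closer it is
    than the second-nearest class. A small margin means low confidence."""
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, float("inf")
    margin = abs(x - cents[ranked[1]]) - abs(x - cents[best])
    return best, margin


def self_train(labeled_x, labeled_y, unlabeled_x, min_margin=1.0, rounds=5):
    """Hypothetical self-training loop: fit on labeled data, pseudo-label only
    the confident unlabeled points, add them to the training set, repeat."""
    xs, ys = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(rounds):
        cents = centroids(xs, ys)
        confident, rest = [], []
        for x in pool:
            label, margin = predict(cents, x)
            if margin >= min_margin:
                confident.append((x, label))
            else:
                rest.append(x)
        if not confident:
            break  # nothing confident left to pseudo-label
        for x, label in confident:
            xs.append(x)
            ys.append(label)
        pool = rest
    return centroids(xs, ys), pool


# Two labeled points and five unlabeled ones.
model, still_unlabeled = self_train(
    [0.0, 10.0], ["a", "b"], [1.0, 2.0, 9.0, 8.0, 5.2]
)
print(model, still_unlabeled)
```

In this toy run the ambiguous point 5.2 never receives a pseudo-label, which mirrors the limitation noted above: low-confidence examples still need human labeling.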
Studies on adapting the LambdaRank model to semi-supervised learning (SSL) were also presented.
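As an illustration of the ranking-specific target metrics mentioned above, the sketch below computes NDCG (normalized discounted cumulative gain), a metric commonly used to evaluate ranking models and the one LambdaRank typically uses to weight its pairwise gradients. The summary does not state which metric the talk focused on, so the choice of NDCG, the exponential gain variant, and the function names `dcg` and `ndcg` are assumptions made for the example.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of relevance grades.
    Uses the exponential gain (2^rel - 1) and a log2 position discount."""
    rels = relevances if k is None else relevances[:k]
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels))


def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (descending) ordering.
    Returns 0.0 when no item is relevant."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance grades in the order the model ranked the documents.
print(ndcg([3, 2, 1]))  # ideal ordering
print(ndcg([1, 2, 3]))  # worst ordering of the same documents, scores lower
```

Unlike accuracy, this metric depends on the order of the whole list and rewards placing the most relevant items near the top.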
To keep up with news from OctopusCon, follow us on Facebook and be the first to learn about our events!