Projects 2023-2024
The History of Machine Learning
Do you remember the times when websites used to look like this? https://icml.cc/Conferences/2002/
AI and machine learning have begun to revolutionize the world in many ways, from writing to self-driving to healthcare. Strangely enough, none of this existed just 10 years ago. The field of artificial intelligence has existed for less than 100 years, with major advances only happening in the last 10. What was going on 20 years ago? What about 15 years ago? How on Earth did we get to a point where it’s not even me writing this text?
Exploring this question will be the main activity of the BAINSA Research division during Semester I. We will start from the year 2002, implementing a basic KNN to segment satellite images, and go all the way to 2023, using ChatGPT for image editing. Every week, our theme will be the next year, in chronological order.
Should you join the Research division, you will gain a perspective on AI that very few people, even professionals, have. In short, the main project of BAINSA Researchers will be to create a time-machine GitHub repository: 20 projects for 20 years of AI. View our articles here.
Cryptocurrencies Forecasting with News and Social Media
The goal of the project is to assess the predictive power of news and social media sentiment on cryptocurrency prices. This project has elements of data science (sentiment analysis), financial modeling, and computer science.
Introduction, Goal, and Motivation
The goal of this project is to analyze the predictive power of news and social media on cryptocurrency prices.
Cryptocurrencies are digital assets whose transactions are validated by a distributed ledger, i.e. a network of computers that share the same history of transactions (the blockchain) and that agree on how to add a new transaction to the blockchain (consensus mechanism).
This validation structure makes cryptocurrencies stand out from traditional types of assets because they are decentralized: they are not controlled by a central entity (such as a bank).
The promise of decentralization has excited many retail investors around the world, which has led to the growing popularity of cryptocurrencies, a rise in their value, and the subsequent legitimization of the asset class by institutional investors and governments.
In the last few years, the crypto space has boomed, with applications in a variety of industries such as finance, art, and gaming (DeFi, NFTs, web3).
Given the nascent stage of this market, cryptocurrencies are highly speculative and government regulation is uncertain. This makes their prices highly volatile and susceptible to sudden shifts in public and government opinion.
For this reason, two interesting places to look for information that can help us predict the change in the price of cryptocurrencies are social media and news.
Hence, the goal of this project is to assess exactly this! To what extent are cryptocurrency prices predictable by examining the sentiment of news and social media posts?
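One simple way to make "predictive power" concrete is to correlate the sentiment on a given day with the price return realized afterwards. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the numbers are synthetic toy values rather than project data, and a lagged Pearson correlation is just one possible measure, not necessarily the one the team uses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def daily_returns(prices):
    """Simple returns between consecutive daily prices."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


def lagged_sentiment_correlation(sentiment, prices, lag=1):
    """Correlate sentiment on day t with the daily return realized on day t + lag."""
    if lag < 1:
        raise ValueError("lag must be at least 1 day")
    if len(sentiment) != len(prices):
        raise ValueError("need one sentiment score per price")
    returns = daily_returns(prices)
    # returns[i] is realized on day i + 1, so sentiment on day t is paired
    # with returns[t + lag - 1].
    future = returns[lag - 1:]
    past = sentiment[:len(future)]
    return pearson(past, future)


# Synthetic toy data (assumption): one sentiment score in [-1, 1] and one
# closing price per day. These are not real market or project values.
toy_sentiment = [0.2, -0.1, 0.5, -0.4, 0.3, 0.0]
toy_prices = [100.0, 102.0, 101.0, 106.0, 102.0, 105.0]

next_day_corr = lagged_sentiment_correlation(toy_sentiment, toy_prices, lag=1)
print(f"sentiment vs. next-day return: {next_day_corr:.3f}")
```

A correlation close to zero on real data would suggest little linear predictive power at that lag; a full study would also need out-of-sample evaluation, which this sketch does not attempt.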
Team: Federico Russo, Matteo Bergsagel, Csilla Lelle Janky, Dario Filatrella, Ali Emre Senel.
The project is developed in collaboration with Bending Spoons.
Projects 2022-2023
The BAINSA research team implemented the following projects during the 2022/23 academic year:
The Wikipedia Knowledge Graph
The goal of this project is to create a knowledge graph of the English Wikipedia. Given the extent and exhaustiveness of Wikipedia, creating such a knowledge graph would come arguably close to representing the skeleton of the entirety of human knowledge!
This project has elements of graph theory, computer science (such as storage and search), and machine learning (natural language processing).
Consider the opening paragraph of the Wikipedia page Graph (discrete mathematics). It contains numerous hyperlinks that take the reader to different related pages. Each page can be thought of as a “Node” of the graph, and each hyperlink can be seen as an “Edge” between nodes. In this case, “Graph” connects to “Mathematics”, “Graph Theory”, “Vertices”, and “Diagrammatic Form”. There are many types of graphs: undirected, cyclic, DAGs, mixed…
The term “knowledge” in “knowledge graph” suggests that this structure represents knowledge and the relationships between concepts.
Applications include, but are not limited to, quickly querying topics related to a central topic of interest and measuring the dissimilarity of topics based on their distance in the graph.
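Both applications can be sketched with a breadth-first search over the link structure. This is a minimal illustration, not the project's implementation: the function names are hypothetical, the outgoing links of “Graph” come from the example in the text, and the remaining links are made up to give the search something to traverse.

```python
from collections import deque

# Each page maps to the pages it links to (a directed graph).
links = {
    "Graph": ["Mathematics", "Graph Theory", "Vertices", "Diagrammatic Form"],
    "Graph Theory": ["Mathematics"],   # made-up link for illustration
    "Mathematics": ["Number"],         # made-up link for illustration
}


def hop_distances(graph, source, max_hops=None):
    """Breadth-first search: map each reachable page to its hop count from source."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        page = queue.popleft()
        if max_hops is not None and distances[page] >= max_hops:
            continue  # do not expand pages that are already at the hop limit
        for neighbor in graph.get(page, []):
            if neighbor not in distances:
                distances[neighbor] = distances[page] + 1
                queue.append(neighbor)
    return distances


def link_distance(graph, source, target):
    """Fewest hyperlink hops from source to target, or None if unreachable."""
    return hop_distances(graph, source).get(target)


# Topics directly related to "Graph" (one hop away, plus the page itself).
print(hop_distances(links, "Graph", max_hops=1))
# A crude dissimilarity measure: number of hops between two topics.
print(link_distance(links, "Graph", "Number"))
```

Because hyperlinks are directed, the distance from one page to another need not equal the distance back; a symmetric measure would require treating the graph as undirected.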
Given the extent and exhaustiveness of Wikipedia, creating such a knowledge graph would come arguably close to representing the skeleton of the entirety of human knowledge.
Even if this is already a super cool achievement, knowledge graphs can go beyond: if edges are given “types”, we can define the type of relationship between nodes. For example:
This would result in effectively mapping not only the nodes and relations but also how each node is related to another. A graph of this form is what Google introduced in 2012 to answer our search queries, abandoning the string-matching method.
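A common way to store typed edges is as (subject, relation, object) triples. The sketch below assumes that representation; the class name, the relation names, and the sample triples are all invented for illustration and are not taken from the project or from Google's system.

```python
from collections import defaultdict


class TypedGraph:
    """A tiny knowledge graph whose edges carry a relation type."""

    def __init__(self):
        # subject -> list of (relation, object) pairs
        self._outgoing = defaultdict(list)

    def add(self, subject, relation, obj):
        """Store one (subject, relation, object) triple."""
        self._outgoing[subject].append((relation, obj))

    def query(self, subject, relation):
        """Return every object linked to subject by the given relation."""
        return [o for r, o in self._outgoing.get(subject, []) if r == relation]

    def relations_between(self, subject, obj):
        """Return the relation types on edges from subject to obj."""
        return [r for r, o in self._outgoing.get(subject, []) if o == obj]


kg = TypedGraph()
# Illustrative triples with hypothetical relation names.
kg.add("Rome", "capital_of", "Italy")
kg.add("Rome", "located_in", "Italy")
kg.add("Italy", "located_in", "Europe")

print(kg.query("Rome", "capital_of"))
print(kg.relations_between("Rome", "Italy"))
```

With untyped edges the graph would only record that “Rome” and “Italy” are linked; the relation label is what lets a query distinguish a capital from any other related page.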
Team: Matei Cosa, Keshav Ganesh, Kristian Gijka, Pavle Lalić.
Supervisor: Prof. L. Saglietti.
Computational neuroscience: MICrONS Explorer
The project is built upon the Machine Intelligence from Cortical Networks (MICrONS) program, which seeks to revolutionize machine learning by reverse-engineering the algorithms of the brain [1]. The work consists of using computer programming and machine learning tools to analyze data from a functional connectomics dataset that contains calcium imaging of an estimated 75,000 neurons from the mouse visual cortex [2].
The team uses computational tools such as Python, the DataJoint framework, and Jupyter notebooks to analyze this dataset, which is based on a section of a mouse brain and was created by the MICrONS program. The dataset contains different types of data, such as electron microscopy imaging, cell segmentation, and cell meshes. The main focus of the project is analyzing the responses of the rodent brain to various visual stimuli.
Team: Riley Scott Jenum, Barbara Karwowska, Emma Anastassova, Alessandro Pagan.
Supervisor: Prof. A. Sanzeni.
References: [1] https://www.microns-explorer.org/; [2] https://www.biorxiv.org/content/10.1101/2021.07.28.454025v2.full
Machine Learning for Financial Statement Analysis
The goal of the project is to optimize Chaptr’s income-tracking capabilities for graduates by fitting a machine learning model to better identify misrepresentation of income based on the Mpesa inflows and outflows reported by users. Chaptr is an education financing venture that works with technical schools to allow African students to learn now and pay later through deferred tuition payment options such as income-share agreements and flexible fee payments [1].
Mpesa is the predominant form of payment in Kenya, outranking even banks. Students using Chaptr’s services are expected to submit monthly reports of their income from Mpesa.
The goal of this project is to flag misrepresentation in Mpesa statements and thereby vet each user’s monthly recurring income accurately.
The team uses techniques such as feature engineering to run exploratory data analysis and model statement data.
Team: Michał Krzyżański, Emil Mollov, Kassym Mukhanbetiyar, Noam Etten.
The project is developed in collaboration with Chaptr Global.
References: [1] Chaptr Global.