Datathon [23-26 June 2021] (Virtual)
The goal of the following challenges is to analyze the NYC Yellow Taxi Trip dataset
(TAXI) and to provide novel interactive visualization methods and visual analytics for the analytical tasks below.
For more information, see the Datathon home page.
Challenge 1: Visual Analytics for Traffic Prediction
Note. The data prediction and visualization are intended for use in interactive applications, so real-time response is required. The focus of this task is on novel visualization methods that allow users to run well-known prediction algorithms (e.g., regression) and interact with the predicted results.
Visualize taxi records on a geographic map.
Analyze the taxi pick-up data to predict traffic for specific "areas", given a future day and time.
Areas can be split into tiles using the "H3 Uber Grid Splitting" method or your preferred approach.
Propose and implement innovative techniques that visually represent "information" regarding traffic prediction, based on historical taxi data in the specified areas. As a naïve solution, different colors and charts can be used to represent the traffic over the areas and for different features, e.g., day of week, time of day, POIs in the area, taxi company, number of passengers.
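As a starting point for the prediction task above, the following is a minimal baseline sketch, not a prescribed method. It assumes pick-ups arrive as (timestamp, latitude, longitude) tuples, replaces H3 with a plain square grid for simplicity (the names tile_of, build_profile, predict and the tile size TILE_DEG are hypothetical), and predicts traffic for a tile as the historical mean number of pick-ups for the same weekday and hour. The sample coordinates are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime
from math import floor

TILE_DEG = 0.01  # hypothetical tile size in degrees (not an H3 resolution)

def tile_of(lat, lon, size=TILE_DEG):
    """Map a coordinate to a square grid tile id (row, col)."""
    return (floor(lat / size), floor(lon / size))

def build_profile(pickups):
    """Return {(tile, weekday, hour): mean pick-ups per observed date}.

    The mean is taken over dates that have at least one pick-up in
    that tile and hour; dates with zero pick-ups are not counted.
    """
    per_date = defaultdict(int)  # (tile, weekday, hour, date) -> count
    for ts, lat, lon in pickups:
        per_date[(tile_of(lat, lon), ts.weekday(), ts.hour, ts.date())] += 1
    totals = defaultdict(int)
    dates = defaultdict(int)
    for (tile, weekday, hour, _date), count in per_date.items():
        totals[(tile, weekday, hour)] += count
        dates[(tile, weekday, hour)] += 1
    return {key: totals[key] / dates[key] for key in totals}

def predict(profile, lat, lon, when):
    """Look up the expected pick-ups for a location and a future time."""
    return profile.get((tile_of(lat, lon), when.weekday(), when.hour), 0.0)

# Made-up sample: two Mondays, 08:00-09:00, all in the same tile.
pickups = [
    (datetime(2021, 6, 7, 8, 5), 40.7581, -73.9855),
    (datetime(2021, 6, 7, 8, 40), 40.7583, -73.9851),
    (datetime(2021, 6, 14, 8, 15), 40.7585, -73.9853),
    (datetime(2021, 6, 14, 8, 20), 40.7589, -73.9858),
    (datetime(2021, 6, 14, 8, 50), 40.7582, -73.9852),
    (datetime(2021, 6, 14, 8, 55), 40.7586, -73.9857),
]
profile = build_profile(pickups)
forecast = predict(profile, 40.7580, -73.9850, datetime(2021, 6, 21, 8, 30))
print(forecast)
```

Because the profile is a precomputed dictionary, each prediction is a single lookup, which suits the real-time requirement. A regression model or an H3-based tiling could be swapped in behind the same predict interface.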
Challenge 2: Visual Analytics for Dirty Mobility Data
Taxi and other mobility companies aggregate data from various sources and end up with dirty or duplicate trip entries. A common task in such cases involves an analyst who wishes to assess the dataset with respect to data quality. Visual techniques may reveal information (e.g., correlations, patterns) that is not easily captured by traditional (non-visual) methods. In our case, visually analyzing "information" related to duplicate entries will help the analyst recognize the data patterns, values, or specific attributes where duplicate records appear. Beyond revealing data quality issues, these insights will enable the analyst to improve the effectiveness of their deduplication techniques.
Note. The duplicate records will be given to the participants beforehand.
Visualize duplicate records on a geographic map.
Propose and implement new visual methods to represent duplicate records on a map. For example, you may connect duplicate records with lines, as in the duplicate information presented in the RawVis tool.
Propose and implement techniques that visually provide "information" and present statistics related to a set of duplicate records, so that a user can gain insight into the characteristics of duplicate records, e.g., which attributes most commonly hold duplicate values.
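The two ideas above (connecting duplicates with lines, and summarizing which attributes duplicates share) can be sketched as follows. This is only an illustration under stated assumptions: the record layout, the field names (lat, lon, fare, passengers, vendor), the grouping of duplicates into lists of ids, and the helper names duplicate_segments and attribute_agreement are all hypothetical, and the sample values are made up. The actual format of the provided duplicate records may differ.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records keyed by id; field names are assumptions.
records = {
    "a1": {"lat": 40.7581, "lon": -73.9855, "fare": 12.5, "passengers": 1, "vendor": "V1"},
    "a2": {"lat": 40.7583, "lon": -73.9851, "fare": 12.5, "passengers": 1, "vendor": "V2"},
    "b1": {"lat": 40.7306, "lon": -73.9866, "fare": 30.0, "passengers": 2, "vendor": "V1"},
    "b2": {"lat": 40.7308, "lon": -73.9863, "fare": 30.0, "passengers": 2, "vendor": "V1"},
    "b3": {"lat": 40.7310, "lon": -73.9860, "fare": 31.0, "passengers": 2, "vendor": "V2"},
}
# Each inner list holds the ids of records flagged as duplicates of each other.
duplicate_groups = [["a1", "a2"], ["b1", "b2", "b3"]]

def duplicate_segments(records, groups):
    """Return one ((lat, lon), (lat, lon)) segment per pair of duplicates."""
    segments = []
    for group in groups:
        for left, right in combinations(group, 2):
            a, b = records[left], records[right]
            segments.append(((a["lat"], a["lon"]), (b["lat"], b["lon"])))
    return segments

def attribute_agreement(records, groups, attributes):
    """Return, per attribute, the fraction of duplicate pairs with equal values."""
    matches = Counter()
    pairs = 0
    for group in groups:
        for left, right in combinations(group, 2):
            pairs += 1
            for attr in attributes:
                if records[left][attr] == records[right][attr]:
                    matches[attr] += 1
    if pairs == 0:
        return {}
    return {attr: matches[attr] / pairs for attr in attributes}

segments = duplicate_segments(records, duplicate_groups)
agreement = attribute_agreement(
    records, duplicate_groups, ["fare", "passengers", "vendor"]
)
print(len(segments), agreement)
```

The segments can be passed to any map library as polylines, and the agreement fractions can feed a bar chart that ranks attributes by how often duplicates share the same value. Exact equality is used here for simplicity; a tolerance or fuzzy match may suit noisy numeric or text fields better.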
Note. The data prediction and visualization are intended for use in interactive applications, so real-time response is required.