Introduction
The project is designed for students to create cloud-based applications using microservices, with a focus on serverless architecture for data analytics. Students will choose a data-related challenge, develop microservices to address it, and integrate those into a cohesive application. The project responds to the growing need for advanced data analytics software and provides practical experience in building solutions that data scientists across various industries might employ. It emphasizes the development of microservices with a serverless approach to facilitate scalable and efficient data analysis.
Target Application
Many businesses now rely on sophisticated systems for decision-making, but those systems are limited by how hard it is to find and access good-quality data. In this project, we assume a non-profit organisation that will use students to build an Event Intelligence Application using agile methods combined with cloud and DevOps principles, so that different components can be built progressively over time. Teams will follow agile processes to build the system incrementally and comply with some design recommendations.
...
In summary, the Event Intelligence Application is aimed at users who want to acquire event datasets for further processing (e.g., investigating a hypothesis, visualising events on a dashboard, or making predictions). Because many users have overlapping needs, the organisation decides to gather as much event data as possible from different sources and have it managed by dedicated microservices. In this way, the Event Intelligence Application automates gathering raw data from several data sources and analysing it with data processing pipelines made up of reusable components.
Design Guidelines
Architecture
The software architecture diagram provided outlines a system designed around key principles to ensure scalability and independence. Here is a high-level concept:
...
Data Collection: This is the foundational microservice that acquires data from various external sources and standardizes it to fit a specified data model.
Data Retrieval: This service is tasked with fetching the stored data from cloud infrastructure, ensuring it is readily accessible for subsequent operations.
Data Preprocessing: A crucial step where the retrieved data undergoes cleansing and formatting, setting the stage for accurate analysis.
Analytical Model: Utilizing the preprocessed data, this microservice applies statistical models and algorithms to distill insights and patterns.
Visualization/Reporting: The culminating microservice that presents analytical findings through visual aids or reports, facilitating easier interpretation and decision-making.
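The five stages above can be pictured as a chain of small functions, each consuming the output of the previous one. The following is only a minimal sketch under stated assumptions: the in-memory STORE stands in for cloud storage, and the record fields (name, value) and all function names are hypothetical, not taken from the project's data model or architecture diagram.

```python
from statistics import mean

# In-memory stand-in for cloud storage (an assumption for this sketch).
STORE: list[dict] = []


def collect(raw_rows: list[dict]) -> None:
    """Data Collection: standardise raw rows and persist them."""
    for row in raw_rows:
        STORE.append({
            "name": str(row.get("name", "")).strip(),
            "value": row.get("value"),
        })


def retrieve() -> list[dict]:
    """Data Retrieval: fetch the stored records."""
    return list(STORE)


def preprocess(records: list[dict]) -> list[dict]:
    """Data Preprocessing: drop records with no value, cast the rest to float."""
    cleaned = []
    for record in records:
        if record["value"] is None:
            continue
        cleaned.append({**record, "value": float(record["value"])})
    return cleaned


def analyse(records: list[dict]) -> dict:
    """Analytical Model: a trivial summary statistic as a placeholder model."""
    values = [record["value"] for record in records]
    return {"count": len(values), "mean": mean(values) if values else None}


def report(summary: dict) -> str:
    """Visualization/Reporting: render the summary as a one-line text report."""
    return f"{summary['count']} events, mean value {summary['mean']}"


def run_pipeline(raw_rows: list[dict]) -> str:
    """Run all five stages in order."""
    collect(raw_rows)
    return report(analyse(preprocess(retrieve())))
```

In a real deployment each function would sit behind its own API rather than being called directly, which is what allows any single stage to be replaced without touching the others.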
The Microservices and API
For the project, students are required to develop several distinct microservices tailored for data analytics. Each microservice must communicate through APIs, both for internal functionality and for external interfaces. Because these APIs will eventually be used and tested by other teams, each microservice must be built to operate independently. This modular design allows any microservice to be replaced with an alternative implementation in the future if needed. Students will publish their APIs on a marketplace for other teams to use. For the final application, a team will receive higher marks if more teams use its APIs, or if it meaningfully integrates more of other teams' APIs into its own application.
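As an illustration of a microservice exposed through an API, the sketch below shows a Python AWS Lambda handler written for an API Gateway Lambda proxy integration. The event keys httpMethod and queryStringParameters and the statusCode/headers/body response shape follow that integration's format; the records, the source query parameter, and the handler's behaviour are hypothetical and not part of the project specification.

```python
import json

# Hypothetical records standing in for a real data store.
_EVENTS = [
    {"id": 1, "source": "news", "title": "Example A"},
    {"id": 2, "source": "finance", "title": "Example B"},
]


def _response(status: int, payload: dict) -> dict:
    """Build a response in the shape expected by the Lambda proxy integration."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event: dict, context) -> dict:
    """Return stored events, optionally filtered by a 'source' query parameter."""
    if event.get("httpMethod") != "GET":
        return _response(405, {"error": "method not allowed"})
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    source = params.get("source")
    events = [e for e in _EVENTS if source is None or e["source"] == source]
    return _response(200, {"events": events})
```

Because the handler depends only on its input event, another team can call it over HTTP without knowing how the data is stored, and the implementation behind the endpoint can be swapped later.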
...
Set5: Analytical Model + Visualization/Reporting
The Data Model
For the project, students are encouraged to download or access online open-source datasets in the following areas: finance, ESG, economy, news and social networks, climate, and public health. Each student is required to select one (or more) of these datasets as the basis for their microservices application.
All event data must be stored in a consistent format, which is detailed in the data model document. A uniform format lets teams share data analytics services with one another regardless of the data source, which improves collaboration and interoperability across the project. Details can be found on the following page:
25T1-Data Model Specification - SENG3011 25T1 - UNSW SENG
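To show why a uniform format makes services shareable, the sketch below maps two differently shaped inputs onto one common record shape. Every field name here (timestamp, event_type, attributes, and the input columns) is invented for illustration; the authoritative field names and structure are those in the data model specification linked above, not these.

```python
from datetime import datetime, timezone

# Hypothetical common shape; the real one is defined by the data model specification.
REQUIRED_KEYS = {"timestamp", "event_type", "attributes"}


def from_price_row(row: dict) -> dict:
    """Map a hypothetical finance row (date, ticker, close) onto the common shape."""
    day = datetime.strptime(row["date"], "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return {
        "timestamp": day.isoformat(),
        "event_type": "price",
        "attributes": {"ticker": row["ticker"], "close": float(row["close"])},
    }


def from_news_item(item: dict) -> dict:
    """Map a hypothetical news item (published_epoch, headline) onto the common shape."""
    published = datetime.fromtimestamp(item["published_epoch"], tz=timezone.utc)
    return {
        "timestamp": published.isoformat(),
        "event_type": "news",
        "attributes": {"headline": item["headline"]},
    }


def is_uniform(record: dict) -> bool:
    """Check that a record has exactly the agreed top-level keys."""
    return set(record) == REQUIRED_KEYS
```

Once both sources produce records of the same shape, a downstream preprocessing or analytics service written by another team can consume either one without source-specific code.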
The Tech Stack
The recommended technology stack for this project is centered around Amazon Web Services (AWS), and it includes various tools and services that students will utilize:
...
Teams may vary this recommended stack.
Sprints Overview
In the project, students will complete three sprints, each with specific objectives:
...