Data Flows
The data flow for each feature in our requirements is as follows:
1. User Ability to Tweet
a. Tweet Creation: When a user composes a tweet, the client sends a request to the server containing the tweet content and associated metadata.
b. Data Processing: The server validates the request, processes hashtags, mentions, and other elements within the tweet.
c. Database Interaction: The tweet is then stored in the Tweets table in the database, linked to the user's primary key in the Users table.
d. Cache Update: The tweet is also written to a cache such as Redis to speed up retrieval.
e. Fan-Out: The tweet is propagated to the followers' home timelines, for example by pushing it into each follower's cached timeline at write time (fan-out on write).
f. Acknowledgment: An acknowledgment is sent back to the user, confirming the successful posting of the tweet.
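The tweet write path above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production design: plain dictionaries stand in for the Tweets table, the Followers table, and Redis, and the names post_tweet, MAX_TWEET_LEN, and TIMELINE_CAP are hypothetical.

```python
import re
from collections import defaultdict, deque

MAX_TWEET_LEN = 280   # assumed length limit
TIMELINE_CAP = 800    # assumed cap on cached home-timeline entries per user

HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")

tweets = {}                    # stands in for the Tweets table: tweet_id -> row
tweet_cache = {}               # stands in for a Redis tweet cache
followers = defaultdict(set)   # stands in for the Followers table: followee -> followers
home_timelines = defaultdict(lambda: deque(maxlen=TIMELINE_CAP))  # per-user cached timelines


def post_tweet(user_id, text):
    """Validate, parse, store, cache, and fan out a tweet; return the stored row."""
    # (b) Validate the request.
    if not text or len(text) > MAX_TWEET_LEN:
        raise ValueError(f"tweet must be 1..{MAX_TWEET_LEN} characters")

    # (b) Extract hashtags and mentions from the content.
    row = {
        "id": len(tweets) + 1,  # toy ID scheme; a real system needs a distributed ID generator
        "user_id": user_id,
        "text": text,
        "hashtags": HASHTAG_RE.findall(text),
        "mentions": MENTION_RE.findall(text),
    }

    # (c) Store the row and (d) cache it.
    tweets[row["id"]] = row
    tweet_cache[row["id"]] = row

    # (e) Fan out on write: push the tweet ID onto each follower's home timeline.
    for follower_id in followers[user_id]:
        home_timelines[follower_id].appendleft(row["id"])

    # (f) The returned row serves as the acknowledgment.
    return row
```

Fan-out on write keeps home-timeline reads cheap, but it makes a post by a user with many followers expensive, which is why the sketch caps each cached timeline.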
2. User Ability to Follow People
a. Follow Request: The user initiates a follow request for another user.
b. Relationship Establishment: The server processes the request and updates the Followers table, establishing a relationship between the follower and followee.
c. Timeline Update: The home timeline of the follower is updated to include the followee's tweets.
d. Notification: Optionally, a notification may be sent to the followee.
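As a rough sketch of the follow flow, the function below records the relationship, backfills the follower's home timeline with the followee's recent tweets, and optionally queues a notification. The in-memory structures, the BACKFILL size, and the function name follow are assumptions made for illustration.

```python
from collections import defaultdict

BACKFILL = 20  # assumed number of recent tweets copied into the follower's timeline

following = defaultdict(set)        # follower -> set of followees
followers = defaultdict(set)        # followee -> set of followers
user_tweets = defaultdict(list)     # author -> [(timestamp, tweet_id)], oldest first
home_timelines = defaultdict(list)  # user -> [(timestamp, tweet_id)], newest first
notifications = defaultdict(list)   # user -> pending notification messages


def follow(follower_id, followee_id, notify=True):
    """Create a follow relationship; return False if it already existed."""
    if follower_id == followee_id:
        raise ValueError("a user cannot follow themselves")
    if followee_id in following[follower_id]:
        return False  # idempotent: a repeated request changes nothing

    # (b) Record both directions of the relationship.
    following[follower_id].add(followee_id)
    followers[followee_id].add(follower_id)

    # (c) Merge the followee's most recent tweets into the follower's home timeline.
    recent = user_tweets[followee_id][-BACKFILL:]
    merged = home_timelines[follower_id] + recent
    merged.sort(reverse=True)  # newest first
    home_timelines[follower_id] = merged

    # (d) Optionally notify the followee.
    if notify:
        notifications[followee_id].append(f"{follower_id} followed you")
    return True
```

Storing the relationship in both directions is a design choice in this sketch: one index answers "whom do I follow?" for timeline reads, the other answers "who follows me?" for fan-out.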
3. User Ability to See Their Own Timeline
a. Timeline Request: The user requests to view their timeline.
b. Cache Retrieval: The server first looks in the cache (e.g., Redis) to quickly retrieve recent tweets.
c. Database Query: If the cache misses or holds too few tweets, the server queries the Tweets table for the rest.
d. Response: The tweets are sorted in reverse chronological order (newest first) and sent back to the user.
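The read path for a user's own timeline follows the cache-aside pattern, sketched below. The class UserTimelineReader, its cache_size parameter, and the row layout are hypothetical; a list of dictionaries stands in for the Tweets table and a plain dictionary for Redis.

```python
class UserTimelineReader:
    """Cache-aside reader for a user's own tweets (illustrative only)."""

    def __init__(self, db_rows, cache_size=3):
        self.db_rows = db_rows        # rows: {"id", "user_id", "created_at", "text"}
        self.cache = {}               # user_id -> newest-first rows, at most cache_size
        self.cache_size = cache_size
        self.db_queries = 0           # counter that makes cache hits observable

    def _query_db(self, user_id, limit):
        # Stands in for: SELECT ... WHERE user_id = ? ORDER BY created_at DESC LIMIT ?
        self.db_queries += 1
        rows = [r for r in self.db_rows if r["user_id"] == user_id]
        rows.sort(key=lambda r: r["created_at"], reverse=True)
        return rows[:limit]

    def get_timeline(self, user_id, limit=2):
        # (b) Serve from the cache when it can satisfy the request.
        cached = self.cache.get(user_id)
        if cached is not None and limit <= self.cache_size:
            return cached[:limit]

        # (c) Otherwise query the database and refresh the cache.
        rows = self._query_db(user_id, max(limit, self.cache_size))
        self.cache[user_id] = rows[:self.cache_size]

        # (d) Rows are already newest first.
        return rows[:limit]
```

The sketch never invalidates the cache; a real system would also update or evict the cached entry whenever the user posts or deletes a tweet.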
4. User Ability to See a Home Timeline
a. Home Timeline Request: The user requests their home timeline, displaying tweets from people they follow.
b. Cache and Database Interaction: The server retrieves relevant tweets from both the cache and the database.
c. Aggregation and Sorting: Tweets from all followed users are merged into a single list and sorted in reverse chronological order (newest first).
d. Response: The sorted tweets are sent back to the user's client for display.
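When the home timeline is assembled at read time, the aggregation step amounts to a k-way merge of per-author lists that are each already sorted newest first. The sketch below uses the standard library's heapq.merge for this; the function name home_timeline and the (created_at, tweet_id) tuple layout are assumptions.

```python
import heapq
from itertools import islice


def home_timeline(user_id, following, tweets_by_author, limit=50):
    """Merge the followed authors' tweet lists into one newest-first page.

    following:        dict of user_id -> iterable of followed author IDs
    tweets_by_author: dict of author_id -> [(created_at, tweet_id)], newest first
    """
    # One pre-sorted stream per followed author.
    streams = [tweets_by_author.get(author, []) for author in following.get(user_id, ())]

    # heapq.merge with reverse=True expects each input sorted largest first
    # and lazily yields items in descending order of created_at.
    merged = heapq.merge(*streams, key=lambda tweet: tweet[0], reverse=True)

    # Take only the first page; the rest of the merge is never computed.
    return list(islice(merged, limit))
```

Because the merge is lazy, the cost of producing one page depends on the page size and the number of followed authors rather than on the total number of tweets.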
5. User Ability to Search with Internal Search Engine
a. Search Query: The user enters a search query, possibly including hashtags or keywords.
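b. Distributed Search: The query is fanned out to multiple Earlybird shards (Earlybird is Twitter's real-time search engine), potentially across several data centers.
c. Index Lookup: Each shard uses an inverted index, which maps each term to the tweets that contain it, to find matching tweets.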
d. Ranking and Sorting: Results are ranked based on popularity and relevance.
e. Response: The final sorted results are returned to the user.
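The search flow can be illustrated with a toy scatter-gather over sharded inverted indexes. This is a sketch under simplifying assumptions and does not reflect Earlybird's actual implementation: the Shard class and search function are hypothetical, hashtags are treated as ordinary keywords, a query matches only tweets containing all of its terms, and ranking uses like counts alone as a stand-in for popularity and relevance.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"\w+")  # "#python" and "python" both yield the term "python"


class Shard:
    """One index shard holding an inverted index over its own tweets."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of tweet IDs
        self.likes = {}                   # tweet_id -> like count

    def index(self, tweet_id, text, likes=0):
        self.likes[tweet_id] = likes
        for term in TOKEN_RE.findall(text.lower()):
            self.postings[term].add(tweet_id)

    def search(self, terms):
        # (c) Intersect the posting lists of all query terms.
        if not terms:
            return []
        ids = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [(tweet_id, self.likes[tweet_id]) for tweet_id in ids]


def search(shards, query, limit=10):
    """Fan the query out to every shard, then merge and rank the hits."""
    terms = TOKEN_RE.findall(query.lower())

    # (b) Scatter: each shard searches its own index.
    hits = [hit for shard in shards for hit in shard.search(terms)]

    # (d) Rank by like count, breaking ties by higher (assumed newer) tweet ID.
    hits.sort(key=lambda hit: (-hit[1], -hit[0]))

    # (e) Return the top results.
    return [tweet_id for tweet_id, _ in hits[:limit]]
```

In a real deployment the shards would be queried in parallel over the network and each would return only its own top results, so the merge step handles a bounded number of candidates.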