Posts

TODOs - Mar 16 2018

Frontend (Twittermap)
- Right sidebar is hidden by scrollbar
- Pinmap left corner tool box needs to be updated or removed
- Popup Tweet content by talking to our own DB - Qiancheng is working
- Integrate multi-layered frontend feature
- Enable time series bar data caching

Bugs
- On the first search for a new keyword, the slicing queries might get stuck at the second mini-query.
- "Future timeout" exception (where and when is not known yet; sometimes when starting the TM)

Features
- A new 47000-record Sample Data set (with the user.pic_url attribute) is needed for the Quick-Start guide environment
- HIVmap should be merged into our GitHub repository in some way.
- Support "Geocell group by" queries with arbitrary scale as a parameter in Cloudberry (the scale is currently fixed)
- A Guard monitoring our whole system's status
- Randomly generate a coordinate for each tweet that has no coordinate value during geotagging preprocessing

Other
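As a rough illustration of the "Geocell group by" item in the Features list above, here is a minimal Python sketch of grouping points into grid cells whose size is passed as a parameter. It is not Cloudberry's implementation or API: the function names `geocell` and `group_by_geocell` and the `(lon, lat)` tuple input are assumptions made for this example only.

```python
import math
from collections import Counter


def geocell(lon, lat, scale):
    """Return integer (column, row) indices of the grid cell containing a point.

    `scale` is the cell size in degrees. Hypothetical helper, not a
    Cloudberry or AsterixDB function.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return (math.floor(lon / scale), math.floor(lat / scale))


def group_by_geocell(points, scale):
    """Count points per grid cell for an arbitrary cell size `scale`."""
    return Counter(geocell(lon, lat, scale) for lon, lat in points)


# Example input: (lon, lat) pairs; tweets without a coordinate are assumed
# to have been filtered out or assigned one beforehand.
points = [(-117.84, 33.64), (-117.80, 33.69), (-118.24, 34.05)]
coarse = group_by_geocell(points, 10.0)  # all three points share one cell
fine = group_by_geocell(points, 1.0)     # the third point gets its own cell
```

Using integer cell indices as keys avoids floating-point noise that would appear if cell corners were computed as `index * scale`.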

Meeting 2018-02-02

Future Tasks - Changes are noted in bold
- Reconstruct AsterixDB install guide - Qiushi
- Maintenance jobs - Qiushi
- Frontend refactoring - Teyu
- Take Multiple-Color-Pointmap as start point - Aaron
- "geoCell" & "getPoint" function - Qiushi
- Wrap the conf file parameters loaded by Twittermap - Qiancheng
- Tableau Evaluation - larger data set - Li Deng
- Dig into the performance issue of pulling data from AsterixDB, like Nik's work - Yang Cao
- Comparison with Oracle - more experiments - Jiliang Ni

Achievements
- Solve the AsterixDB installation problem - Taewoo
- Run Oracle and import Tweets data into Oracle - Jiliang Ni
- Load different layers of data into UI - Yang Cao
- Show 2 colors of points on map - Aaron

Notes
- Rename "Point Map" to "Pin Map" - suggestion
- Build TwitterMap instances for specific keywords (e.g. "HIV") - Collaborators

Meeting 2018-01-26

Future Tasks - Changes are noted in bold
- Solve the installation problem; reconstruct AsterixDB install guide - Qiushi, Chen, Taewoo
- Frontend refactoring - Teyu
- Take Multiple-Color-Pointmap as start point - Aaron
- "geoCell" & "getPoint" function - Qiushi
- Wrap the conf file parameters loaded by Twittermap - Qiancheng
- Tableau Evaluation - larger data set - Li Deng
- Layers of Frontend - Yang Cao
- Comparison with Oracle - Jiliang Ni

Achievements
- Generate "CountMap" and "PointMap" on Tableau - Li Deng
- Successfully imported tweets data to Oracle - Jiliang Ni
- Show tweets on layered UI experiment - Yang Cao
- Stop words merged - Qiancheng
- PointMap Parameter Configuration merged - Qiushi

Notes
- 1.48M records of Sample TwitterData (2017-Jul-01~07). (.zip) 382MB --> (.json) 1.6GB
- 14.6M records of Sample TwitterData (2017-Jul-01~Sep-01). (.zip) 4GB --> (.json) 16GB

Meeting 2018-01-19

Future Tasks - Changes are noted in bold
- Frontend refactoring - Teyu
- Take Multiple-Color-Pointmap as start point - Aaron
- Stop words merge - Qiancheng
- PointMap Parameter Configuration merge - Qiushi
- "geoCell" & "getPoint" function - Qiushi
- Find a new middleware task for Qiancheng - Chen & Qiushi
- Tableau Evaluation - Li Deng
- Layers of Frontend - Yang Cao
- Comparison with Oracle - Jiliang Ni

Notes:
For people who need some sample data to play with, here is the link to 47000 records of Twitter data. The data is in `json` format:
- The whole file is an array wrapped by []: [ {tweet1}, {tweet2} ]
- Each line/element (delimited by ",") is one tweet.

Example: "tweet1" -
{
  "create_at": "2016-01-01T00:00:00.000Z", //Timestamp the tweet was created.
  "id": XXXXXXXXXXXX,
  "text": "text content of this tweet",
  ...
  "user": {sub-object, basic informat
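A minimal Python sketch of reading a sample file in the format described in the notes above (one top-level JSON array, one tweet object per element). The function names and the file name `sample_tweets.json` are hypothetical; only the `create_at` field name is taken from the example, and the sketch assumes the real file is valid JSON (without the `//` comments and placeholders used in the illustration).

```python
import json
from collections import Counter


def load_tweets(path):
    """Load a file whose whole content is one JSON array of tweet objects."""
    with open(path, encoding="utf-8") as f:
        tweets = json.load(f)
    if not isinstance(tweets, list):
        raise ValueError("expected a top-level JSON array of tweets")
    return tweets


def tweets_per_day(tweets):
    """Count tweets per calendar day using the date prefix of `create_at`.

    Assumes `create_at` is an ISO-8601 string such as
    "2016-01-01T00:00:00.000Z"; tweets without the field are skipped.
    """
    return Counter(t["create_at"][:10] for t in tweets if "create_at" in t)


# Usage (the file name is an assumption, not the actual download name):
#   tweets = load_tweets("sample_tweets.json")
#   print(len(tweets), tweets_per_day(tweets).most_common(3))
```

Loading the whole array with `json.load` is fine for the 47000-record sample; the larger multi-gigabyte dumps would likely need a streaming parser instead.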

Meeting 2018-01-12

Independent Tasks for new members

(1) Tableau Evaluation
    - Related to this blog;
    - Tableau website: https://www.tableau.com/
    - Instructions:
        - Install the software
        - Put some sample data into it
        - Play around with its features and UIs, get some sense of its power
        - Look deeper into it and try to work out what kinds of queries Tableau can answer, and evaluate how fast it can reply

(2) Layers of Frontend
    - Take Google Maps as an example: it allows adding or removing different layers of datasets in real time, e.g. Terrain, Traffic, Bicycling routes, Satellite, etc.
    - Explore how to implement a frontend with such features of adding and removing layers of different datasets in real time
    - Instructions:
        - Find some techniques or libraries to do this
        - Do a toy demo to get some sense of how it works and how to implement this
        - Then we will see how to implement this based on our current codebase

(3) Compariso

TODOs - Mar 13 2017

- Work on open issues
- Try other frontend (e.g. Vega, Tableau, ParaView, SuperSet)
- Try other backend (e.g. MongoDB, Druid)
- Rebuild the current demo using other solutions, e.g. Mongo, Spark (for comparison purpose)
- Integrate Flickr image with TwitterMap
- Approximate Answering using Stats
- Support continuous query (in a space of the middleware)
- Middleware cache
- Highly available middleware cluster
- Stress test of the cluster using a high-ingestion rate
- Read the paper titled "I4E: interactive investigation of iterative information extraction"

Some ongoing work
- Implement the Aggregation views
- Sentiment analysis for restaurant domain

[Task] Take a deeper look into "Tableau"

Why
Cloudberry is now in a new stage of extending its power to accelerate general expensive OLAP queries. Therefore, how to design the CB language is a big problem.

What
Define the scope of queries. To design the CB language is to give formal descriptions of the queries that we should answer. Therefore, the first thing to know is what kinds of queries we will answer.

How
Take a deeper look into "Tableau" to see what kinds of queries are sent to its backend, that is, what kinds of queries are answered by "Tableau".