Cool Hack Ideas That Made It To The Finals of Our Hackathon, VizHack 2015
Here are the three winning teams and their Hackathon ideas:
First Prize: Anonymous (Ajay Narang, Anshulika Prasad, Prerna Srivastava, Vidushi Khatri)
Their idea was to build a technology that can help target advertisements at an entirely different level. Right now, ads are targeted to users based on psychographics, demographics, behavioral traits, etc. This product will enable targeting ads based on the content of the videos and images a user is interested in.
Second Prize: Image feature extraction (Abhinay Swaroop, Kushal Wadhwani, Praveen DS, Ravi Prakash)
They were trying to detect whether an image has objectionable content or not. The scope of this project is limited to skin exposure and not object detection. This was a Vizury team!
Third Prize: email@example.com (Satyendra Bilthare, Krishna Kumar, Kunal Dexit, Partha Konwar)
Generally, when someone shares a post on Facebook and we find it attractive, we start searching the Internet for the availability of the product or service near where we live. This process requires a lot of manual work and reviewing of recommendations before actually purchasing the product or using the service, and it takes a long time to reach the desired result. Their hack searches for the specific '#category' given by the #HashTag near the user's place of stay and shows the user all matching #ads, sorted by popularity. The idea can be generalized further to more sectors such as job postings, bookstores, movies, etc. These ads would be a great source of revenue.
Here are some of the coolest hack ideas (and the brains behind them) that made it to the finals.
- ML_Hackers (Srikanth Vidapanakal, Pratik, Nikhilesh): Their idea was to classify malware using large-scale machine learning. Malware classification requires analyzing roughly 0.5 TB of data, which is a tedious process.
- techvine (Aditya Srivatsav, Adit Chauhan, Rohit Gupta): Most of the time, we learn about important dates and notes from other people via WhatsApp, email or Facebook. After receiving them, we save them in reminder apps like Google Keep or Evernote. In all these apps, we have to save the reminder or sticky note ourselves so that we can be reminded later; the apps aren't smart enough to create reminders from WhatsApp, email or Facebook messages. Their app, remify, saves important messages automatically. With it, people may also be able to set reminders and alarms on the phones of their friends, colleagues, classmates or co-workers, and so remind them about important dates and meetings.
- @lpha_d0gs (Aravind Sundaresan, G Arun Kumar, Irfan Basha): Their idea was to analyze data obtained from tweets sent out about a recently hosted event. Companies can use this data to see where their products (launched at the event) stand among users in terms of popularity. It also lets company managers see the amount of positive and negative feedback given by those interested in the product, revealing market trends and customer needs.
- 5minus2 (Sanjana Arun, Tarun S, Rahul Nagaraj): Skin-related diseases are among the most irritating health issues nowadays, be it pimples, patches, infections, or serious skin cancer patches. Medical imagery involves gigabytes to terabytes of image sets, and this big data has to be processed, trained on and classified based on input samples of skin patches to get accurate results. Their approach uses image processing to process the image samples, and machine learning plus big data to train on and classify the image sets into different stages of severity. Their hack thus focuses on big-data medical imagery. Focusing on images of cancerous skin, they scan the mole or patch from the image sample, process it, and monitor the differences over time, calculating the total dermatoscopic value (TDV), which indicates melanoma content and gives the severity. The person will then be given a reference to a dermatologist through the app, by mail or otherwise.
- Chronological Behavior (Jaydev Acharya, Amit Kumar): Their idea was to target specific users based on their online shopping trends. They collect data on users' shopping patterns, such as date and time, and use it to build a profile that targets those users periodically, notifying them with banner ads for their shopping and offers. For example:
- A user travels for certain festivals or vacation trips. They collect data on the trips and, based on it, notify the user around the same time with flight offers, making booking easier.
- A user shops for groceries every month within a specific date range. They collect the grocery list from the shopping cart and, each month before the usual grocery shopping day, show ads as a reminder along with the list of items bought every month. This makes shopping easier, since the user does not have to add each item to the shopping cart again.
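The grocery example above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's implementation: the function name `next_reminder`, the use of the median day of month, and the `days_before` lead time are all hypothetical choices.

```python
from datetime import date, timedelta
from statistics import median


def next_reminder(purchase_dates, today, days_before=2):
    """Estimate the usual shopping day of month and return the next reminder date.

    Assumption: the "usual" day is the median day of month of past purchases,
    clamped to 28 so that the day exists in every month.
    """
    usual_day = min(int(median(d.day for d in purchase_dates)), 28)
    candidate = date(today.year, today.month, usual_day)
    if candidate - timedelta(days=days_before) < today:
        # This month's reminder date has already passed; roll over to next month.
        if today.month == 12:
            year, month = today.year + 1, 1
        else:
            year, month = today.year, today.month + 1
        candidate = date(year, month, usual_day)
    return candidate - timedelta(days=days_before)


history = [date(2015, 1, 5), date(2015, 2, 6), date(2015, 3, 5)]
print(next_reminder(history, today=date(2015, 4, 1)))
```

A real system would also need the stored cart contents to show alongside the reminder; that part is left out here.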
- 17-- (Rahul Dominic, Vivek Vaidya, Ankit Siva): Their idea is to build a multi-platform application that uses machine learning, NLP and textual analysis to bring crisp and precise content to consumers. It first asks users to enter free-form interests, which serve as the basis for their entire experience in the application. A web script curates articles from various sources on the internet across various fields and condenses each to its essential points. These are stored in the back end for a single day. Based on the interests the user entered, the application pulls relevant content and presents it to the user. It is for people who want a personalized news app that gives them content suited to their tastes in a crisp and concise format. For deeper reads, they are referred to the parent site of the article. It is made for those who are busy and want information right then and there, but is not limited to that segment of people.
- R Power (Mahesha Hiremath): The idea for the hack was around predictive text analytics. Suppose there are lots of unstructured text documents, and each text has some impact on user behavior (for example, an ad, an email message, or even a social media post). Predictive text analytics helps predict which text will bring good results in terms of likes, product sales or something similar. This can have good applications in ads, social media marketing, etc.
- HyperLoop (Amitabh Das, Ashrit Shankar, Jeevan Raghu): Their idea was to implement a recommendation system that makes use of both item and collaborative filtering techniques. In theory it overcomes a lot of the standard problems that a normal recommendation engine faces.
- Verloop (Guarav Singh): His hack idea was to push contextual ads to mobile, based on user location.
- 2PC (Mayur Mohite, Santhosh Kumar, Somanath Reddy): Apache Sqoop is a tool for bulk-loading data from HDFS into relational databases in a horizontally scalable manner. The current implementation of Sqoop ensures atomicity of a data load by first loading the data from HDFS into a staging table in the RDBMS using a map-only job. The export to the staging table succeeds only if all the mappers succeed; otherwise it fails. If the load is successful, the data is moved from the staging table to the “main” table in a single transaction. This method of copying data from HDFS to a staging table and then to the main table has several disadvantages:
- Duplicate copying of data from staging to main table
- Space consumed by the staging table will be large if the data set size is large.
- Time taken by this process is also huge.
The two-phase commit protocol is a distributed algorithm which ensures that the distributed processes participating in a transaction complete it atomically. In this problem, the multiple mappers loading data into the RDBMS are the distributed processes. The team will use the two-phase commit protocol to copy the data from HDFS directly into the main table in the RDBMS, without a staging table. Two-phase commit ensures that this distributed transaction is atomic.
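The two-phase commit flow described above can be illustrated with a small in-memory sketch. It is a toy model, not Sqoop code: the `Participant` class stands in for one mapper, a plain list stands in for the main table, and `fail_on_prepare` is a hypothetical switch for simulating a failed mapper. A real implementation would rely on the database's own support for prepared transactions.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical stand-in for one mapper loading its split into the main table."""
    name: str
    rows: list
    fail_on_prepare: bool = False
    state: str = "init"

    def prepare(self) -> bool:
        # Phase 1: get ready to commit, then vote yes or no.
        if self.fail_on_prepare:
            self.state = "aborted"
            return False
        self.state = "prepared"
        return True

    def commit(self, table: list) -> None:
        # Phase 2 (all voted yes): make the prepared rows visible.
        table.extend(self.rows)
        self.state = "committed"

    def rollback(self) -> None:
        # Phase 2 (someone voted no): discard the prepared work.
        self.state = "aborted"


def two_phase_commit(participants, table) -> bool:
    """Apply every participant's rows to `table`, or none of them."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit(table)
        return True
    for p in participants:
        p.rollback()
    return False


main_table = []
mappers = [Participant("m1", [1, 2]), Participant("m2", [3])]
print(two_phase_commit(mappers, main_table), main_table)
```

The point of the sketch is that nothing reaches the main table until every participant has voted yes, which is what removes the need for a staging table.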
- Bid by Price Difference (Premnath Thirumalaisamy, Supreeth Chandrashekhar): Given the frequency of daily discounts and offers, product prices change very frequently. The idea is to update the user profile with the product price at the time of the visit and, whenever a bid request is made, bid higher for those products where there has been a significant drop in price. This can lead to an increase in CTR in markets like India, where a drop in price is one of the decisive factors for purchase.
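As a rough sketch of the bidding rule above: compare the price stored in the user profile with the current price, and raise the bid when the drop is large enough. The function name, the 10% threshold (`min_drop`), the linear boost and the cap (`max_boost`) are all illustrative assumptions, not values from the team's hack.

```python
def price_drop_bid(base_bid, seen_price, current_price, min_drop=0.10, max_boost=2.0):
    """Return a bid that is boosted when the price has dropped since the user's visit.

    min_drop is the fractional drop treated as "significant" (assumed 10%).
    """
    if seen_price <= 0:
        return base_bid  # no usable price in the profile
    drop = (seen_price - current_price) / seen_price
    if drop < min_drop:
        return base_bid  # price rose, stayed flat, or dropped too little
    # Boost grows linearly with the size of the drop, capped at max_boost.
    return base_bid * min(1.0 + drop, max_boost)


print(price_drop_bid(1.0, seen_price=100, current_price=80))
```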
- LumberJacks (Anurag Bhowmick, Madhavi Gokana, Sukrut Hukkerikar, Varun Pahwa): Their idea was to move images stored on distributed NFS into a document store like MongoDB. For cases like refreshing images or deleting stale images, they would use a graph database, which can ease the join effort and give more insight by analyzing the nodes in the database.
Main motivations for trying out a database instead of NFS:
1. Linux file system: directories have a limit on the maximum number of files.
2. Replication and sharing are hard to manage in the case of NFS.
3. Scanning for the stale images on NFS can choke the system.
- Troublemakers (Chandrakanth Reddy Angeri, MathanV, Sanketh Dhopeshwarkar): The idea here is to build a framework for end-to-end verification of newly built servers prior to deploying them. Live traffic is replicated from one of the production servers and passed to the new machine, and all application, system and performance metrics of the new machine are captured. This helps decide whether the new machine can be taken live. The framework is useful in all pre-production testing scenarios, whether testing new features, code changes or new hardware.
- Onliner (Jatinderpal Singh, Sumit Khanna): They planned to do: 1. Hash join through a Bloom filter 2. Better control over joins through cogroup, detecting left skewness, right skewness and data duplication in sets 3. Decoupling parallel code from sequential code - 10x improvement over intellibid.
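To give a feel for the first item in the Onliner entry, here is a minimal single-process sketch of a hash join that uses a Bloom filter to skip rows with no possible match. The `BloomFilter` class, its sizes and the function `bloom_hash_join` are illustrative assumptions; the team's actual design is not described in this post. In a distributed job, the usual motivation is that the small filter can be shipped to every worker so non-matching rows are dropped early.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_hash_join(small, large):
    """Inner-join two lists of (key, value) pairs, pre-filtering `large`."""
    bloom = BloomFilter()
    table = {}
    for key, value in small:
        bloom.add(key)
        table.setdefault(key, []).append(value)
    joined = []
    for key, value in large:
        if not bloom.might_contain(key):
            continue  # definitely no match, so skip the hash-table probe
        for left_value in table.get(key, []):  # false positives drop out here
            joined.append((key, left_value, value))
    return joined


print(bloom_hash_join([(1, "a"), (2, "b")], [(2, "x"), (3, "y"), (2, "z")]))
```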
Here's us thanking you all for your participation in VizHack 2015!