HIHO at HUG
- Oct
- 11
We presented HIHO at the Hadoop User Group India meetup, hosted by Impetus Technologies in July 2010. Here are the videos of the meet: HIHO at HUG India. The presentation for the talk can be downloaded here.
Working with Cascading
- Oct
- 05
Map Reduce applications tend to get very complex due to the sheer volume of data and the number of machines involved. Designing and debugging Map Reduce jobs spread over many machines is an art in itself. If we add a framework like Cascading, we save time and effort as we can abstract Map Reduce and think more in terms [...]
Amazon Elastic Map Reduce Lessons Learnt
- Oct
- 03
Elastic Map Reduce is a great web service for getting up and running with Hadoop without setting up your own clusters. We recently worked on a vertical search engine using EMR. As part of our processing, we had our initial data on S3, and we also wanted to place the fetched data on S3. We were [...]
Why HIHO?
- Oct
- 03
- Posted by admin
- Posted in Uncategorized
Currently, there is little support in Hadoop for querying a database and getting the results. The application developer has to spend significant effort and time extracting the data from the database. The existing DBInputFormat and DataDrivenDBInputFormat are table based, so if one wants to get data from multiple tables, one has to [...]
Hello HIHO!
- Oct
- 03
- Posted by admin
- Posted in General, Uncategorized
We are glad to announce the beta release of HIHO, an open source framework for integrating datastores with Apache Hadoop. This post introduces HIHO and talks briefly about its capabilities. Data which needs to be analysed in Hadoop is often stored in conventional data stores. Typical data analysis tasks can be: 1. Match profile information [...]