Top 10 IBM Big Data & Analytics Hub podcasts of 2017

It can be difficult to keep up with all the best podcast episodes during the year. That's why we've compiled the Top 10 podcasts of the year from the IBM...

Fast clustering algorithms for massive datasets

Here we discuss two algorithms that can cluster big data sets extremely fast, as well as the graphical representation of such complex clustering structures. By extremely fast,...

Big data set – 3.5 billion web pages – made available for all of...

We provide the hyperlink graph on four different levels of aggregation: Page-Level Graph - This version of the graph contains all details with each node representing a single web page and each...

Another large data set – 250 million data points – available for download

This is the full resolution GDELT event dataset running January 1, 1979 through March 31, 2013 and containing all data fields for each event record. The years 1979 through 2005,...

Topic Modeling in R – Big Data News

As part of Twitter data analysis, so far I have completed Movie review using R & Document Classification using R. Today we will be dealing with discovering topics in Tweets, i.e. to mine...

Data Lakes Still Need Governance Life Vests

As a central repository and processing engine, data lakes hold great promise for raising return on data assets (RDA). Bringing analytics directly to different data in its native formats can...

Data has always existed, the key is the right data

What does The Library of Alexandria, The Normans and a book have to do with data? I never thought about The Library... ...at Alexandria was in charge of collecting all the world's knowledge,...

Hadoop Yarn explanation and container memory allocations

YARN ResourceManager (the YARN service master component) 1) Controls the total resource capacity of the cluster 2) Whenever a container is needed in the cluster, it sets the minimum container size...

7 Tools to extract text from HTML document

I want to share an interesting article about data scraping that you might need in your business. The article below is mainly reprinted from here. Text in the HTML document is the content...

Data Wars: Dawn of the Yottabyte

Big Data is an accumulation of data that is too large and complex for processing by traditional database management tools. - Merriam-Webster. Yeah But, What Really Makes Big Data Big Data? This...

APPLICATIONS

How women are helping to fight cybercrime – Naked Security

Today is International Women’s Day. And, in celebration of just some of the women working to fight cybercrime, we asked a number of...
