Thursday, December 7, 2017

Big Data understanding

Building Blocks for Big Data Project

-        Working knowledge of Hadoop and the Hadoop ecosystem
o   Be comfortable with basic Linux commands
o   Data warehousing knowledge and SQL commands
o   Programming languages like Java, Python, R, Perl etc.
-        Understanding the data structure and business objective
-        Data visualization tools like Tableau, QlikView, JasperReports etc.
-        Be comfortable with analytics tools like R, Python, Spark, SAS etc.
-        Be comfortable with statistics (exploratory) and machine learning algorithms




What disrupted the Data Center?




Every industry is graced with more data…

• Richer transactional data from a portfolio of dozens or hundreds of business applications
• Usage and behavior data from web and mobile apps
• Social media data
• Sensor and event data from IoT devices
• Data economy – firms buying and selling data
• Derived data from analytics

What is the challenge?

• The challenges include capture, curation, storage, search, sharing, transfer, analysis and visualization
• The main challenge lies in identifying the value, the relevant information within this data, and then transforming and extracting that data for further analysis.


What is Big Data?

• Is it a technology?
• Is it a solution?
• Is it a problem?
• Is it a platform?
• Is it a statement/phrase?

Big Data – 4 V’s
  • According to IDC (International Data Corporation), the size of the digital universe was 4.4 zettabytes in 2013, with a forecast of tenfold growth to 44 zettabytes by 2020
  • A zettabyte is 10^21 bytes, which is one thousand exabytes, one million petabytes or one billion terabytes
  • The NYSE generates about 4-5 terabytes of data per day
  • Facebook hosts more than 240 billion photos, growing at 7 petabytes per month
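
To make the unit arithmetic above concrete, here is a small Python sketch. It assumes decimal (SI) units, and the names `UNITS` and `convert` are made up for this illustration rather than taken from any library.

```python
# Decimal (SI) storage units expressed in bytes.
# Assumption: powers of ten, not binary units such as TiB.
UNITS = {
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount from one unit in UNITS to another."""
    return amount * UNITS[from_unit] / UNITS[to_unit]


# One zettabyte expressed in the smaller units listed above.
zb_in_exabytes = convert(1, "zettabyte", "exabyte")
zb_in_petabytes = convert(1, "zettabyte", "petabyte")
zb_in_terabytes = convert(1, "zettabyte", "terabyte")

# The photo growth figure quoted above (7 petabytes per month), in terabytes.
photo_growth_tb_per_month = convert(7, "petabyte", "terabyte")

print(zb_in_exabytes, zb_in_petabytes, zb_in_terabytes)
print(photo_growth_tb_per_month)
```

The same helper can be used to sanity-check any of the other figures in this list.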


IBM’s Definition of Big Data


Big data – Myths

·       It’s Big: You need to have lots of data to talk about big data
·       You need to apply it right away
·       The more granular the data, the better
·       Big Data is good data
·       Big Data means that analysts become all-important
·       Big Data gives you concrete answers
·       Big Data predicts the future
·       Big Data is a magical solution
·       Big Data can create self-learning algorithms
·       Big Data is only for big corporations
·       We Have So Much Data, We Don't Need to Worry About Every Little Data Flaw
·       Big Data Technology Will Eliminate the Need for Data Integration
·       It's Pointless Using a Data Warehouse for Advanced Analytics
·       Data Lakes Will Replace the Data Warehouse
·       Hadoop is the holy grail of big data
·       Machine Learning Overcomes Human Bias


Big Data – Scenarios





What is Hadoop?
  •     Hadoop is an open-source data management framework with scale-out storage and distributed processing

Hadoop is not a database. Hadoop (from the Apache Software Foundation) is a Java-based software framework for scalable, decentralized applications that supports easy handling and analysis of vast data volumes.
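
To give a feel for what "distributed processing" means here, below is a toy, single-machine Python sketch of the map, shuffle and reduce steps of Hadoop's classic MapReduce model, applied to a word count. It is only an illustration under simplifying assumptions: the function names are invented for this post, everything runs in one process, and a real Hadoop job would be written against Hadoop's own APIs and run across many nodes.

```python
from collections import defaultdict


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the combined output."""
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(mapped))


# Each string plays the role of a block of input stored on a different node.
chunks = ["big data is big", "hadoop handles big data"]
counts = word_count(chunks)
print(counts)
```

Because each chunk is mapped independently of the others, the map step is the part a cluster can spread across many machines.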





Existing Data Architecture



Limitations of Existing Data Analytics Architecture




An Emerging Data Architecture



Emerging Data Analytics Architecture



DBMS vs. HADOOP







Why Hadoop?


·        Supports use of inexpensive, commodity hardware
                - No RAID needed. Also, the servers need not be the latest and greatest hardware.
·        Provides for simple, massive parallelism
·        Provides resilience by replicating data and eliminating tape backups
·        Provides locality of execution, as it knows where the data is
·        The software is free
·        High quality support available at modest cost
·        Certification available
·        Easy to support when using a GUI such as Cloudera Manager or Ambari
·        Add-on tools available at relatively low cost, or in some cases no cost
·        Evolving technology with a high degree of interest around the world
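
The points above about replication and locality can be sketched in a few lines of Python. This is a toy model, not how HDFS is implemented: the 128 MB block size and replication factor of 3 match the commonly used HDFS defaults in Hadoop 2.x, but the round-robin placement, the node names and all function names are assumptions made for illustration (real HDFS placement is rack-aware).

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=128 * MB):
    """Return the sizes of the blocks a file of file_size bytes is cut into."""
    full_blocks, remainder = divmod(file_size, block_size)
    return [block_size] * full_blocks + ([remainder] if remainder else [])


def place_replicas(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (toy round-robin policy)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + i) % len(nodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


def readable_blocks(placement, failed_node):
    """Blocks that still have at least one replica after one node fails."""
    return {
        block
        for block, holders in placement.items()
        if any(node != failed_node for node in holders)
    }


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical cluster
blocks = split_into_blocks(300 * MB)
placement = place_replicas(len(blocks), nodes)

print([size // MB for size in blocks])
print(placement)
print(readable_blocks(placement, "node3"))
```

In this model, losing any single node leaves every block readable, and a scheduler that knows `placement` can send work for a block to one of the nodes already holding it, which is the idea behind locality of execution.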


Hadoop Ecosystem





Analytics mapping – Hadoop 1.x



Analytics mapping – Hadoop 2.x





Typical Big Data Project – Role of Hadoop Ecosystem




Opportunity and Market Outlook



Who is using Hadoop?




Which companies implemented Hadoop?

http://wiki.apache.org/hadoop/poweredBy



The next post will be on Hadoop 2.x.

This post uses information from Analytic lab.




