Nucleon Map&Reduce

Pure .NET Distributed Computing

Nucleon Map&Reduce is distributed data computing software for relational and NoSQL databases and file systems. It is built on a distributed computing framework based on Microsoft .NET WCF.

Architecture

Nucleon DCF (Distributed Computing Framework) is a highly scalable storage platform designed to process very large data sets across hundreds to thousands of computing nodes that operate in parallel. The term MapReduce refers to two separate and distinct tasks that Nucleon DCF programs perform: map and reduce.

Map&Reduce is a programming paradigm in which a job is split into map and reduce steps that run distributed and in parallel. A MapReduce program consists of a Map method that performs filtering and sorting and a Reduce method that performs a summary operation.
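The split between the two methods can be illustrated with a small word count. This is a conceptual sketch in Python, not the Nucleon C# API; the function names `map_phase` and `reduce_phase` and the sample input are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(records, predicate):
    """Map: filter the words, emit (key, 1) pairs, and sort them by key."""
    pairs = [
        (word.lower(), 1)
        for line in records
        for word in line.split()
        if predicate(word)
    ]
    return sorted(pairs)


def reduce_phase(pairs):
    """Reduce: summarise the pairs by summing the values of each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


lines = ["map reduce map", "reduce a map"]
# Filtering step: ignore one-letter words.
pairs = map_phase(lines, lambda word: len(word) > 1)
counts = reduce_phase(pairs)
print(counts)
```

In a distributed run, each node would apply the map step to its own share of the records, and the reduce step would combine the pairs produced by all nodes.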

Nucleon DCF orchestrates the processing by marshaling the distributed servers (nodes), running the various tasks in parallel, managing all communications and data transfers between the various parts of the system, and providing for redundancy and fault tolerance. Nucleon DCF uses Microsoft WCF technology for distributed computing and communication between nodes.
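As a rough illustration of this kind of orchestration, the sketch below runs tasks in parallel on a local thread pool and retries a task that fails. It is written in Python and makes several assumptions: threads stand in for remote nodes, a simple retry stands in for fault tolerance, and `run_with_retry`, `orchestrate` and `flaky_sum` are hypothetical names. It does not show how Nucleon DCF or WCF actually implement these features.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_retry(task, chunk, max_attempts=3):
    """Run one task, retrying on failure (stand-in for rescheduling on another node)."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task(chunk, attempt)
        except RuntimeError as error:
            last_error = error
    raise last_error


def orchestrate(task, chunks, max_workers=4):
    """Run the task over all chunks in parallel; results keep the chunk order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_with_retry, task, chunk) for chunk in chunks]
        return [future.result() for future in futures]


def flaky_sum(chunk, attempt):
    """Hypothetical task: the first attempt fails for chunks of odd length."""
    if attempt == 1 and len(chunk) % 2 == 1:
        raise RuntimeError("simulated node failure")
    return sum(chunk)


partial_sums = orchestrate(flaky_sum, [[1, 2], [3, 4, 5], [6]])
total = sum(partial_sums)
print(partial_sums, total)
```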

Nucleon DCF (Map&Reduce) is inspired by, and works like, the Apache Hadoop® system. It is a pure Microsoft .NET WCF based framework that currently runs only in Microsoft Windows environments. Nucleon Map&Reduce can scale to hundreds of server cores to analyze data or files using data streaming.

Master Node

The Master Node (the main node, acting as Job Tracker and Maintainer) is the point of interaction between clients and the map/reduce framework. When a map/reduce job is submitted, the job manager places it in a queue of pending jobs, executes the jobs on a first-come, first-served basis, and manages the assignment of map and reduce tasks to the task trackers.
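The queueing behaviour described here can be sketched as follows. This Python example is only a model: the `JobTracker` class, its methods, and the round-robin assignment of tasks to workers are assumptions for illustration, since the text does not say which assignment policy the product uses.

```python
from collections import deque
from itertools import cycle


class JobTracker:
    """Hypothetical master-node model: a FIFO job queue plus task assignment."""

    def __init__(self, worker_names):
        self.pending = deque()
        self.workers = list(worker_names)

    def submit(self, job_name, tasks):
        """Append a job to the end of the pending queue."""
        self.pending.append((job_name, list(tasks)))

    def run_next(self):
        """Take the oldest pending job and assign its tasks round-robin."""
        job_name, tasks = self.pending.popleft()
        assignment = {name: [] for name in self.workers}
        for worker, task in zip(cycle(self.workers), tasks):
            assignment[worker].append(task)
        return job_name, assignment


tracker = JobTracker(["worker-1", "worker-2"])
tracker.submit("job-a", ["map-0", "map-1", "map-2"])
tracker.submit("job-b", ["map-0"])

first = tracker.run_next()   # job-a was submitted first, so it runs first
second = tracker.run_next()
print(first)
print(second)
```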

Worker Node

A Worker Node (Task Tracker and Manager) executes jobs (tasks) on instruction from the Master Node (Job Tracker) and also handles data motion between the map and reduce phases. Nucleon Map&Reduce always provides at least one local worker node for code execution, data computing and processing.
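The data motion between the map and reduce phases is commonly called a shuffle: values with the same key are brought together so that one reducer sees all of them. The Python sketch below shows one common way to do this, hash partitioning by key. Whether Nucleon partitions data this way is not stated in the text, so the `partition_for` and `shuffle` helpers and the use of CRC32 are assumptions.

```python
import zlib
from collections import defaultdict


def partition_for(key, reducer_count):
    """Pick a reducer for a key with a hash that is stable across processes."""
    return zlib.crc32(key.encode("utf-8")) % reducer_count


def shuffle(map_outputs, reducer_count):
    """Group the values of each key and route every key to exactly one reducer."""
    partitions = [defaultdict(list) for _ in range(reducer_count)]
    for pairs in map_outputs:
        for key, value in pairs:
            partitions[partition_for(key, reducer_count)][key].append(value)
    return [dict(partition) for partition in partitions]


# Output of two hypothetical map tasks.
map_outputs = [
    [("map", 1), ("reduce", 1)],
    [("map", 1), ("node", 1)],
]
partitions = shuffle(map_outputs, reducer_count=2)
for index, partition in enumerate(partitions):
    print(index, partition)
```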

Database System

A MapReduce application can execute Map&Reduce code against database systems and process their tables and views. The rows of a database table or view result are distributed over the worker server nodes. The results of the map nodes are filtered on the main node.
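The flow from table rows to a filtered result can be sketched as below. The example is in Python and uses an in-memory SQLite database as a stand-in for a real database system; the `orders` table, the `split_rows` and `map_chunk` helpers, and the threshold used for filtering are all hypothetical.

```python
import sqlite3


def split_rows(rows, node_count):
    """Deal the rows out to node_count chunks, one chunk per worker node."""
    return [rows[index::node_count] for index in range(node_count)]


def map_chunk(chunk):
    """Map step on one worker: total amount per region within its chunk."""
    totals = {}
    for region, amount in chunk:
        totals[region] = totals.get(region, 0.0) + amount
    return totals


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE orders (region TEXT, amount REAL)")
connection.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("north", 10.0), ("south", 5.0), ("north", 2.5), ("east", 4.0)],
)
rows = connection.execute(
    "SELECT region, amount FROM orders ORDER BY rowid"
).fetchall()
connection.close()

partials = [map_chunk(chunk) for chunk in split_rows(rows, node_count=2)]

# Main node: merge the partial results, then filter them.
merged = {}
for partial in partials:
    for region, amount in partial.items():
        merged[region] = merged.get(region, 0.0) + amount
result = {region: amount for region, amount in merged.items() if amount >= 5.0}
print(result)
```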

File System

A MapReduce application can execute Map&Reduce code against file systems and process the files. The file content is distributed over the server nodes, and the results are filtered on the main node.
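For files, one plausible approach is to stream the content in batches of lines and hand each batch to a node. The Python sketch below assumes line-based batching and a simple text search as the map step; `read_in_batches`, `map_batch` and the sample log content are hypothetical and do not describe how Nucleon splits files.

```python
import os
import tempfile
from itertools import islice


def read_in_batches(path, batch_size):
    """Stream a text file as lists of at most batch_size lines."""
    with open(path, encoding="utf-8") as handle:
        while True:
            batch = list(islice(handle, batch_size))
            if not batch:
                return
            yield batch


def map_batch(batch, needle):
    """Map step on one node: keep the lines that contain the search text."""
    return [line.rstrip("\n") for line in batch if needle in line]


# Write a small sample file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("INFO start\nERROR disk full\nINFO retry\nERROR timeout\nINFO done\n")
    log_path = tmp.name

try:
    partials = [
        map_batch(batch, "ERROR")
        for batch in read_in_batches(log_path, batch_size=2)
    ]
finally:
    os.remove(log_path)

# Main node: combine the per-batch results into one sorted list.
errors = sorted(line for partial in partials for line in partial)
print(errors)
```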

Main Features

  • Job Distribution, Reduction and Finalization
  • C# Code for Map&Reduce
  • R Statistics Support
  • Query, Filter, Compute and Visualize Results
  • Map&Reduce to File System
  • Map&Reduce to Database Tables