We like to show how well our Migrator for Notes to SharePoint tool migrates content. Fidelity of content migration is important to us and to our customers.
So I created two new documents in one of our test Notes databases. One document contained two image file attachments. The other contained the same images, but embedded directly in the document body.
I had two copies of the #TheDress image: one is black and blue, the other is gold and white. I honestly cannot tell which one is which.
I migrated the IBM Notes documents to a Discussion list in O365.
The document with file attachments looks like below in IBM Notes.
In O365, the same migrated document appears as below.
If I click on View Properties, I will see the file attachments.
The IBM Notes document with embedded images appears as below.
The migrated document appears as below in O365.
The good news is that the images migrated successfully from IBM Notes to O365 and maintained their image quality.
However, the problem still remains ... what colors do you see in #TheDress? You can decide what the colors are in IBM Notes, and again after migration to O365, knowing that they are the same images.
by Danny Stout
Manufacturers have a long history of successfully employing data - big data - to help make important and insightful business decisions. According to a recent article in CMSWire by Joanna Schloss, a subject matter expert specializing in data and information management at Dell, early adoption means the industry is set to be a primary benefactor of the big data analytics boom.
Schloss submits that as an early adopter of big data, with a ubiquitous presence in society, and unparalleled access to data collection, the manufacturing industry has a plethora of new revenue streams available to it. In her article, she outlines three:
The potential of big data is now a reality for every industry. Manufacturing just happens to be positioned to immediately begin delivering on it and reaping the rewards.
You can read all of Joanna's insights here.
by Uday Tekumalla
Predictive analytics are used by companies for everything from customer retention and direct marketing to forecasting sales. But at the University of Iowa Hospitals and Clinics, predictive analytics are serving a far more noble purpose - to decrease post-surgical infections.
By utilizing a number of different data points gathered from 1,600 patients, each of whom had colon surgery performed at the University's hospitals, the medical teams have dramatically reduced the number of patients afflicted with post-surgical infections. In fact, over a two-year period, those infections were slashed by an impressive 58 percent.
That is an impressive feat. There are, after all, a multitude of variables that can lead to an infection. This analysis considered several different data points - patients' medical history, data from monitoring equipment, data from national registries, and real-time data collected while the surgery is being performed, such as blood loss and wound contamination. The University built predictive models using Dell Statistica predictive analytics software to achieve these impressive results. Running this analysis allows the hospital to determine a patient's risk level for post-surgical infection, providing the medical team with clear insight into the medications and treatment plans to employ going forward to minimize the risk of infection.
Along with providing better patient outcomes, the University of Iowa has also likely reduced medical costs. This is an exciting example of the potential of predictive analytics. Learn more about the university's results here.
What are you using to compare and synchronize database schemas? DBAs need to be confident that deployment scripts accurately reflect the changes that were intended to come from development. The need for accurate scripts becomes even more urgent in companies moving toward Continuous Deployment.
We saw an ideal way to build those functions into Toad for Oracle Xpert Edition with Compare Schema & Sync. In fact, we’ve added a new feature, Compare Multiple Schemas & Sync, especially useful for DBAs who have to maintain the accuracy of multiple schemas between two databases.
Watch our John Pocknell demonstrate how to compare schemas in this Toad Xpert Edition video (3 minutes):
The results show the differences between the schemas, such as any objects present in one and absent from the other. Toad can also generate a synchronization script for applying the changes to the target database.
This is the final post in my 5-part video series designed to help you decide whether Toad for Oracle Xpert Edition is right for you. If the possibility of inaccurate schema deployments is keeping you up at night, have a look at Reason #5 in our updated technical brief, “Five Ways Toad Xpert Edition Can Help You Write Better Code,” for more details on comparing schemas.
And, if you want to compare and synchronize a few hundred schemas in your own databases, download a 30-day trial and take Toad Xpert Edition for a test drive.
All of the student teams attending this summer's HPC Advisory Council's International Supercomputing Conference (HPCAC-ISC) have the same goal in mind: win the student cluster competition. But the new team from South Africa may feel some added pressure when they arrive in Frankfurt this July. They hope to become the third consecutive champions from their country.
Team "Wits-A" from the University of Witwatersrand in Johannesburg won the right to defend South Africa's title at ISC '15 during the South African Center for High Performance Computing's (CHPC) Ninth National Meeting held in December at Kruger National Park. The students bested 7 other teams from around South Africa.
As part of their victory, the South Africans recently traveled to the United States. On their itinerary was a tour of the Texas Advanced Computing Center (TACC) where they had the opportunity to see the Visualization Laboratory (Vislab) and Stampede Supercomputer, while gaining insights about how to best compete at the ISC '15 Student Cluster Challenge in July. Also on the itinerary was a Texas tradition - sampling some down home BBQ!
Hoping for that three-peat win are Ari Croock, James Allingham, Sasha Naidoo, Robert Clucas, Paul Osel Sekyere, and Jenalea Miller, with Vyacheslav Schevchenko and Nabeel Rajab as reserve team members.
You can learn more about Team South Africa here.
by Suzanne Tracy
Some 4,100 genetic diseases affect humans. Tragically, they are also the primary cause of death in infants, but identifying which specific genetic disease is affecting an afflicted child is a monumental task. Increasingly, however, medical teams are turning to high performance computing and big data to uncover the genetic cause of pediatric illnesses.
Through the adoption of HPC and big data, clinicians are now able to accelerate the delivery of new diagnostic and personalized medical treatment options. Successful personalized medicine is the result of analyzing genetic and molecular data from both patient and research databases. The usage of high performance computing allows clinicians to quickly run the complex algorithms needed to analyze the terabytes of associated data.
The marriage of personalized medicine and high performance computing is now helping to save the lives of pediatric cancer patients thanks to a collaboration between Translational Genomics Research Institute (TGen) and the Neuroblastoma and Medulloblastoma Translational Research Consortium (NMTRC).
The NMTRC conducts various medical trials, generating literally hundreds of measurements per patient, which then must be analyzed and stored. Through a ground-breaking collaboration between TGen, Dell and Intel, NMTRC is now using TGen’s highly-specialized software and tools, which include Dell’s Genomic Data Analysis Platform and cloud technology, to decrease the data analysis time from 10 days to as little as six hours. With this information, clinicians are able to quickly treat their patients, and dramatically improve the efficacy of their trials.
Thanks to the collaboration, NMTRC has launched personalized pediatric cancer medical trials to provide near real-time information on individual patients' tumors. This allows clinicians to make faster and more accurate diagnoses, while determining the most effective medications to treat each young patient. Clinicians are now able to target the exact malignant tumor, while limiting any potential residual harm to the patient.
You can read more about this inspiring collaboration here.
By Armando Acosta
The Apache™ Hadoop® platform speeds storage, processing and analysis of big, complex data sets, supporting innovative tools that draw immediate insights.
Big data has taken a giant leap beyond its large-enterprise roots, entering boardrooms and data centers across organizations of all sizes and industries. The Apache Hadoop platform has evolved along with the big data landscape and emerged as a major option for storing, processing and analyzing large, complex data sets. In comparison, traditional relational database management systems or enterprise data warehouse tools often lack the capability to handle such large amounts of diverse data effectively.
Hadoop enables distributed parallel processing of high-volume, high-velocity data across industry-standard servers that both store and process the data. Because it supports structured, semi-structured and unstructured data from disparate systems, the highly scalable Hadoop framework allows organizations to store and analyze more of their data than before to extract business insights. As an open platform for data management and analysis, Hadoop complements existing data systems to bring organizational capabilities into the big data era as analytics environments grow more complex.
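To make that processing model concrete, the sketch below shows a minimal word-count job for Hadoop Streaming written in Python. It is for illustration only and is not part of any Dell | Cloudera offering; the script name, the HDFS paths and the streaming jar location in the comments are assumptions.

```python
#!/usr/bin/env python
# wordcount.py - illustrative Hadoop Streaming job (mapper and reducer in one file).
# A typical (hypothetical) invocation:
#   hadoop jar hadoop-streaming.jar \
#     -input /data/text -output /data/wordcount \
#     -mapper "python wordcount.py map" \
#     -reducer "python wordcount.py reduce" \
#     -file wordcount.py
import sys


def map_stdin():
    # Emit "<word>\t1" for every word; Hadoop shuffles and sorts these by key
    # across the cluster before they reach the reducers.
    for line in sys.stdin:
        for word in line.strip().split():
            print("%s\t1" % word.lower())


def reduce_stdin():
    # Input arrives sorted by key, so all counts for a word are contiguous.
    current, count = None, 0
    for line in sys.stdin:
        word, _, value = line.rstrip("\n").partition("\t")
        if word != current:
            if current is not None:
                print("%s\t%d" % (current, count))
            current, count = word, 0
        count += int(value or 1)
    if current is not None:
        print("%s\t%d" % (current, count))


if __name__ == "__main__":
    mode = sys.argv[1] if len(sys.argv) > 1 else "map"
    map_stdin() if mode == "map" else reduce_stdin()
```

Because both storage and computation live on the same industry-standard servers, each mapper runs against the data blocks stored locally on its node, which is what gives the framework its scalability.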
Early adopters tended to utilize Hadoop for batch processing; prime use cases included data warehouse optimization and extract, transform, load (ETL) processes. Now, IT leaders are expanding the application of Hadoop and related technologies to customer analytics, churn analysis, network security and fraud prevention — many of which require interactive processing and analysis.
As organizations transition to big data technologies, Hadoop has become essential for enabling predictive analytics that use multiple data sources and types. Predictive analytics helps organizations in many different industries answer business-critical questions that had been beyond their reach using basic spreadsheets, databases or business intelligence (BI) tools. For example, financial services companies can move from asking “How much does each customer have in their account?” to answering sophisticated business enablement questions such as “What upsell should I offer a 25-year-old male with checking and IRA accounts?” Retail businesses can progress from “How much did we sell last month?” to “What packages of products are most likely to sell in a given market region?” A healthcare organization can predict which patient is most likely to develop diabetes and when.
Using Hadoop and analytical tools to manage and analyze big data, organizations can personalize each customer experience, predict manufacturing breakdowns to avoid costly repairs and downtime, maximize the potential for business teams to unlock valuable insights, drive increased revenue and more. [See the sidebar, “Doing the (previously) impossible.”]
Effective use of big data is key to competitive gain, and Dell works with ecosystem partners to help organizations succeed as they evolve their data analytics capabilities. Cloudera plays an important role in the Hadoop ecosystem by providing support and professional feature development to help organizations leverage the open-source platform.
The combination of Cloudera® software on Dell servers enables organizations to successfully implement new data capabilities on field-tested, low-risk technologies. (See section, “Taking Hadoop for a test-drive.”)
Dell | Cloudera Hadoop Solutions comprise software, hardware, joint support, services and reference architectures that support rapid deployment and streamlined management (see figure). Dell PowerEdge servers, powered by the latest Intel® Xeon® processors, provide the hardware platform.
Solution stack: Dell | Cloudera Hadoop Solutions for big data
Dell | Cloudera Hadoop Solutions are available with Cloudera Enterprise, designed specifically for mission-critical environments. Cloudera Enterprise comprises the Cloudera Distribution including Apache Hadoop (CDH) and the management software and support services needed to keep a Hadoop cluster running consistently and predictably. Cloudera Enterprise allows organizations to implement powerful end-to-end analytic workflows — including batch data processing, interactive query, navigated search, deep data mining and stream processing — from a single common platform.
Accelerated processing. Cloudera Enterprise leverages Hadoop YARN (Yet Another Resource Negotiator), a resource management framework designed to transition users from general batch processing with Hadoop MapReduce to interactive processing. The Apache Spark™ compute engine provides a prime example of how YARN enables organizations to build an interactive analytics platform capable of large-scale data processing. (See the sidebar, “Revving up cluster computing.”)
Built-in security. Role-based access control is critical for supporting data security, governance and compliance. The Apache Sentry system, integrated in CDH, enhances data access protection by defining what users and applications can do with data, based on permissions and authorization. Apache Sentry continues to expand its support for other ecosystem tools within Hadoop. It also includes features and functionality from Project Rhino, originally developed by Intel to enable a consistent security framework for Hadoop components and technologies.
Dell | Cloudera Hadoop Solutions, accelerated by Intel, provide organizations of all sizes with several turnkey options to meet a wide range of big data use cases.
Getting started. Dell QuickStart for Cloudera Hadoop enables organizations to easily and cost-effectively engage in Hadoop development, testing and proof-of-concept work. The solution includes Dell PowerEdge servers, Cloudera Enterprise Basic Edition and Dell Professional Services to help organizations quickly deploy Hadoop and test processes, data analysis methodologies and operational needs against a fully functioning Hadoop cluster.
Taking the first steps with Hadoop through Dell QuickStart allows organizations to accelerate cluster deployment to pinpoint effective strategies that address the business and technical demands of a big data implementation.
Going mainstream. The Dell | Cloudera Apache Hadoop Solution is an enterprise-ready, end-to-end big data solution that comprises Dell PowerEdge servers, Dell Networking switches, Cloudera Enterprise software and optional managed Hadoop services. The solution also includes Dell | Cloudera Reference Architectures, which offer tested configurations and known performance characteristics to speed the deployment of new data platforms.
Cloudera Enterprise is thoroughly tested and certified to integrate with a wide range of operating systems, hardware, databases, data warehouses, and BI and ETL systems. Broad compatibility enables organizations to take advantage of Hadoop while leveraging their existing tools and resources.
Advancing analytics. The shift to near-real-time analytics processing necessitates systems that can handle memory-intensive workloads. In response, Dell teamed up with Cloudera and Intel to develop the Dell In-Memory Appliance for Cloudera Enterprise with Apache Spark, aimed at simplifying and accelerating Hadoop cluster deployments. By providing fast time to value, the appliance allows organizations to focus on driving innovation and results, rather than on using resources to deploy their Hadoop cluster.
The appliance’s ease of deployment and scalability addresses the needs of organizations that want to use high-performance interactive data analysis for analyzing utility smart meter data, social data for marketing applications, trading data for hedge funds, or server and network log data. Other uses include detecting network intrusion and enabling interactive fraud detection and prevention.
Built on Dell hardware and an Intel performance- and security-optimized chipset, the appliance includes Cloudera Enterprise, which is designed to store any amount or type of data in its original form for as long as desired. The Dell In-Memory Appliance for Cloudera Enterprise comes bundled with Apache Spark and Cloudera Enterprise components such as Cloudera Impala and Cloudera Search.
Cloudera Impala is an open-source massively parallel processing (MPP) query engine that runs natively in Hadoop. The Apache-licensed project enables users to issue low-latency SQL queries to data stored in Apache HDFS™ (Hadoop Distributed File System) and the Apache HBase™ columnar data store without requiring data movement or transformation.
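As a minimal sketch of what such a query looks like in practice, the following Python snippet uses the open-source impyla client to run a low-latency aggregation against an Impala daemon. The hostname, port and the web_clicks table are illustrative assumptions, not details from this article.

```python
# Minimal sketch: issue a SQL query to Impala from Python via the impyla client.
from impala.dbapi import connect

# Hypothetical Impala daemon endpoint; 21050 is a commonly used default port.
conn = connect(host="impala-daemon.example.com", port=21050)
cur = conn.cursor()

# Impala executes the query directly against data stored in HDFS or HBase,
# with no MapReduce job and no data movement or transformation.
cur.execute(
    """
    SELECT region, COUNT(*) AS clicks
    FROM web_clicks
    WHERE event_date = '2015-06-01'
    GROUP BY region
    ORDER BY clicks DESC
    LIMIT 10
    """
)
for region, clicks in cur.fetchall():
    print(region, clicks)

cur.close()
conn.close()
```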
Cloudera Search brings full-text, interactive search and scalable, flexible indexing to CDH and enterprise data hubs. Powered by Hadoop and the Apache Solr™ open-source enterprise search platform, Cloudera Search is designed to deliver scale and reliability for integrated, multi-workload search.
Since its beginnings in 2005, Apache Hadoop has played a significant role in advancing large-scale data processing. Likewise, Dell has been working with organizations to customize big data platforms since 2009, delivering some of the first systems optimized to run demanding Hadoop workloads.
Just as Hadoop has evolved into a major data platform, Dell sees Apache Spark as a game-changer for interactive processing, driving Hadoop as the data platform of choice. With connected devices and embedded sensors generating a huge influx of data, streaming data must be analyzed in a fast, efficient manner. Spark offers the flexibility and tools to meet these needs, from running machine-learning algorithms to graphing and visualizing the interrelationships among data elements — all on one platform.
Working together with other industry innovators, Dell is enabling organizations of all sizes to harness the power of Hadoop to accelerate actionable business insights.
Joey Jablonski contributed to this article.
Apache Hadoop and big data analytics capabilities enable organizations to do what they couldn’t do before, whether that means making memorable customer experiences or optimizing operations.
Personalized content. A digital media company turned to Hadoop when burgeoning data volumes hindered its mission to simplify marketers’ access to data that would let them tailor content to individual customers. The company’s move to Cloudera Enterprise, powered by Dell PowerEdge servers, enabled complex, large-scale data processing that delivered greater than 90 percent accuracy for its content personalization services. Moreover, the 24x7 reliability of the Hadoop platform lets the company provide the data its customers need, when they need it.
Product quality management. To help global manufacturers efficiently manage product quality, Omneo implemented a software solution based on the Cloudera Distribution including Apache Hadoop (CDH) running on a cluster of Dell PowerEdge servers. Using the solution, Omneo customers can quickly search, analyze and mine all their data in a single place, so they can identify and resolve emerging supply chain issues. “We are able to help customers search billions of records in seconds with Dell infrastructure and support, Cloudera’s Hadoop solution, and our knowledge of supply chain and quality issues,” says Karim Lokas, senior vice president of marketing and product strategy for Omneo, a division of the global enterprise manufacturing software firm Camstar Systems. “With the visibility provided by this solution, manufacturers can put out more consistent, better products and have less suspect product go out the door.”
Information security services. Dell SecureWorks is on deck 24 hours a day, 365 days a year, to help protect customer IT assets against cyberthreats. To meet its enormous data processing challenges, Dell SecureWorks deployed the Dell | Cloudera Apache Hadoop Solution, powered by Intel Xeon processors, to process billions of events every day. “We can collect and more effectively analyze data with the Dell | Cloudera Apache Hadoop Solution,” says Robert Scudiere, executive director of engineering for SecureWorks. “That means we’re able to increase our research capabilities, which helps with our intelligence services and enables better protection for our clients.” By moving to the Dell | Cloudera Apache Hadoop Solution, Dell SecureWorks can put more data into its clients’ hands so they can respond faster to security threats than before.
How can IT decision makers determine the best way to capitalize on an investment in Apache Hadoop and big data initiatives? Dell has teamed up with Intel to offer the Dell | Intel Cloud Acceleration Program at Dell Solution Centers, giving decision makers a firsthand opportunity to see and test Dell big data solutions.
Experts at Dell Solution Centers located worldwide help bolster the technical skills of anyone new (and not so new) to Hadoop. Participants gain hands-on experience in a variety of areas, from optimizing performance for an application deployed on Dell servers to exploring big data solutions using Hadoop. At a Dell Solution Center, participants can attend a technical briefing with a Dell expert, investigate an architectural design workshop or build a proof of concept to comprehensively validate a big data solution and streamline deployment. Using an organization’s specific configurations and test data, participants can discover how a big data solution from Dell meets their business needs.
For more information, visit Dell Solution Centers.
The expansion of the Internet of Things (IoT) has led to a proliferation of connected devices and machines with embedded sensors that generate tremendous amounts of data. To derive meaningful insights quickly from this data, organizations need interactive processing and analytics, as well as simplified ecosystems and solution stacks.
Apache Spark is poised to become the underpinning technology driving the analysis of IoT data. Spark utilizes in-memory computing to deliver high-performance data processing. It enables applications in Hadoop clusters to run up to 100 times faster than Hadoop MapReduce in memory or 10 times faster on disk. Integrated with Hadoop, Spark runs on the Hadoop YARN (Yet Another Resource Negotiator) cluster manager and is designed to read any existing Hadoop data.
Within its computing framework, Spark is tooled with analytics capabilities that support interactive query, iterative processing, streaming data and complex analytics such as machine learning and graph analytics. Because Spark combines these capabilities in a single workflow out of the box, organizations can use one tool instead of traditional specialized systems for each type of analysis, streamlining their data analytics environments.
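As a minimal sketch of that single-workflow idea, the following PySpark application (Spark 1.x-style APIs) runs a SQL aggregation and an MLlib clustering step over the same data set in one program. The HDFS path, column names and clustering parameters are assumptions for illustration, not part of any Dell or Cloudera product.

```python
# Minimal sketch: SQL-style query plus machine learning in a single PySpark job.
from pyspark import SparkContext
from pyspark.sql import SQLContext, Row
from pyspark.mllib.clustering import KMeans

sc = SparkContext(appName="sensor-analytics")
sqlContext = SQLContext(sc)

# Load raw sensor readings (hypothetical CSV: device_id,temperature,vibration).
lines = sc.textFile("hdfs:///data/sensor_readings.csv")
readings = lines.map(lambda l: l.split(",")).map(
    lambda p: Row(device_id=p[0], temperature=float(p[1]), vibration=float(p[2]))
)

# Interactive-style SQL over the in-memory data set.
df = sqlContext.createDataFrame(readings)
df.registerTempTable("readings")
hottest = sqlContext.sql(
    "SELECT device_id, AVG(temperature) AS avg_temp "
    "FROM readings GROUP BY device_id ORDER BY avg_temp DESC LIMIT 10"
)
hottest.show()

# Machine learning on the same data: cluster devices by behavior with MLlib.
features = readings.map(lambda r: [r.temperature, r.vibration])
model = KMeans.train(features, k=3, maxIterations=20)
print("Cluster centers:", model.clusterCenters)

sc.stop()
```

Because both steps run inside one Spark application on YARN, there is no need to move the data between a separate query engine and a separate machine-learning system.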
By Paul Steeves
An adaptable IT infrastructure is critical in helping enterprises match specific workload requirements and keep pace with advances in computing technology.
The Dell PowerEdge FX architecture enables IT infrastructures to be constructed from small, modular blocks of computing resources that can be easily and flexibly scaled and managed.
The densely packed PowerEdge FD332 storage block allows FX-based infrastructures to rapidly and flexibly scale storage resources. The PowerEdge FD332 is a half-width, 1U module that holds up to sixteen 2.5-inch hot-plug Serial Attached SCSI (SAS) or SATA SSDs or HDDs. The PowerEdge FD332 is independently serviceable while the PowerEdge FX2 chassis is operating.
FX servers can be attached to one or more PowerEdge FD332 blocks. For each storage block, a single server can attach to all 16 drives, or access can be split so that two servers each attach to 8 of the drives.
This flexibility lets administrators combine FX servers and storage in a wide variety of configurations to address specific processing needs. For example, three PowerEdge FD332 blocks can provide up to 48 drives in a single 2U PowerEdge FX2 chassis — while leaving one half-width chassis slot to house a PowerEdge FC630 for processing. This flexibility results in 2U rack servers with massive direct-attach capacity, enabling a pay-as-you-grow IT model.
Two half-width PowerEdge FD332 storage blocks and two PowerEdge FC630 server blocks in a PowerEdge FX2 chassis
Designed to simplify cabling for the PowerEdge FX2 chassis, the FN I/O aggregators offer as much as eight-to-one cable aggregation, combining eight internal server ports into one external cable. The I/O aggregators also optimize east-west traffic communication within the chassis, helping greatly increase overall performance by accelerating virtual machine migration and significantly lowering overall latency.
The FN I/O aggregators include automated networking functions with plug-and-play simplicity that give server administrators access-layer ownership. Administrators can quickly and easily deploy a network using the simple graphical user interface of the Chassis Management Controller (CMC) or perform custom network management through a command-line interface. The I/O aggregators also provide Virtual Link Trunking (VLT) as well as uplink link aggregation.
The FN I/O aggregators support Data Center Bridging (DCB), Fibre Channel over Ethernet (FCoE) and Internet SCSI (iSCSI) optimization to enable converged data and storage traffic. By converging I/O through the aggregators, it is possible to eliminate redundant SAN and LAN infrastructures within the data center. As a result, cabling can be reduced up to 75 percent when connecting server blocks to upstream switches like the Dell Networking S5000 10/40GbE unified storage switch for full-fabric Fibre Channel breakout. Moreover, I/O adapters can be reduced by 50 percent.
PowerEdge FN410s. The FN I/O aggregator provides four ports of small form-factor pluggable + (SFP+) 1/10GbE connectivity and eight 10GbE internal ports. With SFP+ connectivity, the I/O aggregator supports optical and direct-attach copper (DAC) cable media.
PowerEdge FN410s: Four-port SFP+ I/O aggregator
PowerEdge FN410t. The FN I/O aggregator offers four ports of 1/10GbE 10GBASE-T connectivity and eight 10GbE internal ports. The 10GBASE-T connectivity supports cost-effective copper media with maximum transmission distance up to 328 feet (100 meters).
PowerEdge FN410t: Four-port 10GBASE-T I/O aggregator
PowerEdge FN2210s. Providing innovative flexibility for convergence within the PowerEdge FX2 chassis, the PowerEdge FN2210s delivers up to two ports of 2/4/8 Gbps Fibre Channel bandwidth through N_Port ID Virtualization (NPIV) proxy gateway (NPG) mode, along with two SFP+ 10GbE ports. It also can be reconfigured with four SFP+ 10GbE ports through a reboot.
PowerEdge FN2210s: Four-port combination Fibre Channel/Ethernet I/O aggregator
NPG mode technology enables the PowerEdge FN2210s to use converged FCoE inside the PowerEdge FX2 chassis while maintaining traditional unconverged Ethernet and native Fibre Channel outside of the chassis. To converged network adapters, the PowerEdge FN2210s appears as a Fibre Channel forwarder, while its Fibre Channel ports appear as NPIV N_ports or host bus adapters to the external Fibre Channel fabric. This capability allows for connectivity upstream to a Dell Networking S5000 storage switch or many widely deployed Fibre Channel switches providing full fabric services to a SAN array; the PowerEdge FN2210s itself does not provide full Fibre Channel fabric services.