Author: Robert Pound

Links: Enterprise SolutionsBusiness Intelligence and Appliance Solutions, Boomi Getting Started

Episode 4 | Dell Boomi XML to CSV Mapping

This episode covers mapping an XML file to a comma-separated value (CSV) in Dell Boomi. It builds on information covered in the previous episodes.

The Dell Business Intelligence Project Using USPTO Data project uses Dell Boomi, Dell Quickstart Data Warehouse Appliance and Toad products to analyze publicly available data. For more information about the goal and scope of the project, as well as a breakdown of the episodes, go to: episode 1.

Single XML to CSV file

In the last project, we downloaded and extracted a zip file from an HTTP connection. Building on this set of tools, we can test the mapping function in Dell Boomi. Then we use this mapping to extract useful XML data. Lastly, we can save it to a CSV file. The steps for this episode are outlined here:

  • Manually create XML
  • Modify components
  • Create XML to CSV map
    • Import XML profile
    • Create simple CSV profile
    • Map fields
  • Run Process

Many of the required components are similar to the components in the First Project. We start by copying that process.

To copy components:

Figure 1.      Copy your previous work in Component Explorer.

  1. In the Component Explorer, select the blue triangle next to First project and select Copy.
  2. In the Copy Component window, select the same folder used in the previous project. In this case Dell-REP.
  3. Clear the Copy component dependents? check box.
  4. Click OK.
  5. In Component Explorer, double-click the process named First Project 2.
  6. Rename the process to Single XML to CSV.
  7. Click Save.

 Figure 2.      Clear the Copy component dependents? check box.


Manually create XML

If the process First project ran successfully, then find the file c:\uspto\test.xml. This file is stored on the system where the process was run. This file contains over 4000 patent XML files, but for this episode you only need a single patent.

To separate out the top patent file:

  1. With Notepad, open test.xml.
    Opening this file may take at least 2 minutes because the file is 400MB.
  2. To open the find utility, press <Ctrl-F>.
  3. Search for </us-patent
  4. Highlight the line </us-patent-grant> and all rows above it in the file.
  5. Press <Ctrl-C> to copy the patent.
  6. Use Notepad to open a new file.
  7. To paste the contents of the patent into the new Notepad file, place your cursor in the Notepad file, and press <Ctrl-V>.
  8. Remove these two header tags from the file (DTD references do not work in Dell Boomi):
  1. <?xml version="1.0" encoding="UTF-8"?
  2. <!DOCTYPE us-patent-grant SYSTEM "us-patent-grant-v42-2006-08-23.dtd" [ ]>
  • Select Save as and then select All files from the dropdown list Save as type.
  • Name the file c:\uspto\single.xml and select Save.

Modifying components

Next, modify the process to reuse the components created in the previous project.

To modify components:

Figure 3.      Click Configure when it appears.

  1. In the process Single XML to CSV, hover over the Start shape and click Configure.
  2. From the drop-down menu, select Disk and Get.
  3. Next to the Connection field, select the icon and choose local uspto (created in the previous project).
  4. In the Operation field, select the (+) and choose the following:
  1. Name:                           Get single xml
  2. File Filter:                     single.xml
  3. File Matching Type:        Wildcards
  • To create the connector, select Save and Close and then OK.
  • Select Remove for the shape Unzip.
  • On filename test.xml, select Configure. Edit the Label and the Parameter to test.csv
  • Select OK.

Figure 4.      Edit the label on parameter to test.csv.


Creating the XML to CSV map

Create the two profiles that Dell Boomi uses to interpret the data: one for XML and one for CSV. Once these files are created, create a map component to link the fields in the XML to the fields in the CSV file.

To create the XML profile:

  1. Under the Build tab, in the Create Component menu, create a profile with the following values:
  1. Name:               USPTO_XML
  2. Type                 XML
  • Select Import.
  • Select XML File and click Choose File.
  • Select single.xml click OK > Next > Finish.
    1. You should see a list of elements from the patent.
  • Click Save and Close.


Creating the CSV profile

To create the CSV profile:

Figure 5.      Create these elements.

  1. Under the Build tab, in the Create Component menu, create a profile with the following values:
    1. Name:               SimpleCSV
    2. Type                 Flat
  2.  On the Options tab, set File delimiter to Comma delimited.
  3. On the Data Elements tab, click Elements.
  4. Select Add Multiple Element, select 3, and click OK.
  5. Create the elements in Figure 5 with default values with the data type Character.
  6. Click Save and Close.
  7. Drag and drop the Map shape into the process.
  8. To the right of the Map field, click the (+) symbol.
  9. Name the Map XML to CSV and select Save.
  10. On the left side of the map click the  symbol.
  11. Select XML and then select the USPTO_XML created in the previous step.
  12. Click OK.
  13. On the right side of the map, click  .
  14. Select Flat File > simpleCSV, and then click OK.
  15. Now that you have the required profiles, simply drag the elements on the left to the corresponding elements on the right. This maps the XML to the CSV file.
  16. Connect the pairs, as shown in Figure 6, and then clickSave and Close.

Figure 6.      Map shape.


Figure 7.      Drag and drop, then save and close.


Run Process

Once you save and deploy your process, you can find the test.csv file in your local directory. For directions on how to deploy a process, refer to the blog Boomi Getting Started.

Figure 8.      Contents of the test.csv file.