Extracting Structured Data from Visio – A Flexible Data Export Tool

Started by Yacine, March 16, 2025, 04:51:37 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

Yacine

As a process engineer, I frequently work with functional specifications, often requiring data from Visio diagrams. Manually copying shape properties (e.g., tags, functional descriptions) into documents is error-prone and time-consuming.

To solve this, I wrote a small data extraction tool that allows users to:
  • define the data to export and their format
  • select Visio shapes
  • and get automatically a list of data to paste in Excel, a text document, or other tools

What makes it special?
Regular report tools usually export a whole set of data. To use the data, you would then need to find the right items in your list or table. This one is different, you select visually what you need on your drawing and get only data of the selected shapes.

Second feature - The export format is not hard coded, but uses a format string to extract the data.
The tool uses **placeholders in curly braces `{}`** to define how extracted data should be formatted.



The tool takes first the literal value inside the curly braces and retrieves custom cells like "prop.xyz", "user.xyz", but also standard cells like "PinX" or "LineColor".

In its implementation it tries first the literal name ("xyz"), if not found it tries "prop.xyz", if not found it tries "user.xyz". This allows for a lazy definition of the format strings.

And when the cell is still not found, there are two options. Insert an error code "#####", or remove the shape completely from the results list.

In case of ambiguities, the user can get specific by writing prop.xyz or user.xyz (e.g. "prop.LineColor")

Supported Placeholders
  • {ID} → Returns the shape's ID.
  • {prop.SomeProperty} → Extracts data from ShapeSheet "PROP" section.
  • {user.SomeUserCell} → Extracts data from ShapeSheet "USER" section.

Modifiers for Numerical Values:
  • "{prop.Weight:i}" → Interprets the value as an integer.
  • "{prop.Size:u}" → Returns the value in internal Visio units.

Escape Sequences (Special Characters)
To insert newlines or tabs into the output:
  • "\n" → Newline
  • "\t" → Tab

Items separator
A dedicated field let's you define how the results of the different shapes are separated. Default is "\n", a line break.

---

Schema Management – Save & Reuse Your Custom Formats
To avoid re-entering format strings every time, the tool supports schema management:
  • Define a format string and a separator in the tool's input fields
  • Save it as a named schema (e.g., "P&ID Extraction" or "Network Devices").
  • Select & apply the schema for future use.
  • Schemas are stored as text files (`Schemas.txt`) in `%APPDATA%\VisioCollector\`.
  • The tool provides an interface to browse, edit, rename, and delete schemas.
  • You can sort schemas in a listbox and adjust the order manually.

You cannot view this attachment.

Ready to Try?
Would this be useful for your Visio workflows? I'd love to hear your thoughts and possible improvements!

What other use cases do you see? Would you use this in your field?
Yacine

Yacine

I enhanced a little bit the parsing routines.

Added Name, NameU and Text to the list of special keywords.
E.g. {Text} gets the text of the shape or {NameU} its universal name.

If fieldName = "ID" Then
            result = shp.ID
        ElseIf fieldName = "Name" Then
            result = shp.Name
        ElseIf fieldName = "NameU" Then
            result = shp.NameU
        ElseIf fieldName = "Text" Then
            result = shp.Text
        ElseIf shp.CellExists(fieldName, False) Then
            result = GetCellValue(shp, fieldName, fieldFormat) 'literal cell name
        ElseIf shp.CellExists("prop." & fieldName, False) Then
            result = GetCellValue(shp, "prop." & fieldName, fieldFormat) ' extend to prop
        ElseIf shp.CellExists("user." & fieldName, False) Then
            result = GetCellValue(shp, "user." & fieldName, fieldFormat) ' extend to user
        Else
            ' no match for any variation of fieldName
            If checkAllFields.Value Then ' ommit the shape completely if checkbox AllFields is checked
                inspectShape = ""
                Exit Function
            End If
            result = "#####" ' else assign an error value to the field
        End If

And added Formula and FormulaU to the modifiers.
{LineColor:fu} could now be translated to THEMEGUARD(RGB(255;255;255))

    Select Case LCase(fieldFormat)
        Case "i", "int"
            GetCellValue = CStr(shp.Cells(fieldName).ResultInt(visNoCast, 0))
        Case "u", "iu"
            GetCellValue = CStr(shp.Cells(fieldName).ResultIU())
        Case "f", "formula"
            GetCellValue = shp.Cells(fieldName).Formula
        Case "fu", "formulau"
            GetCellValue = shp.Cells(fieldName).Formula
        Case Else
            GetCellValue = shp.Cells(fieldName).ResultStr("")
    End Select


Yacine

Browser ID: smf (possibly_robot)
Templates: 4: index (default), Display (default), GenericControls (default), GenericControls (default).
Sub templates: 6: init, html_above, body_above, main, body_below, html_below.
Language files: 4: index+Modifications.english (default), Post.english (default), Editor.english (default), Drafts.english (default).
Style sheets: 4: index.css, attachments.css, jquery.sceditor.css, responsive.css.
Hooks called: 259 (show)
Files included: 32 - 1207KB. (show)
Memory used: 1043KB.
Tokens: post-login.
Cache hits: 15: 0.00182s for 26,597 bytes (show)
Cache misses: 4: (show)
Queries used: 17.

[Show Queries]