How to improve filtering with Data Extractor Utility?

Hello,
I’m looking to extract accurate CALC data over a given time range from openHistorian archives (a damping ratio calculated by MANTRA WSU modules). These calculated values are generated on an ad hoc basis, not at a fixed rate such as 30 fps. I think this is why the ‘OH Data Download’ plugin in Grafana does not work with this type of data: for example, I have to specify ‘interval 0’ in the query to display the values as a simple graph, which does not seem to be supported by the download plugin.
So I tried to get them directly with the Data Extraction Utility tool, specifying the desired time range and the ‘CALC’ category. However, I can’t filter this data more precisely because the ‘Device’, ‘Points’ and ‘Summary’ tabs just show a ‘Coming soon’ message.

I store a lot of calculated data, and filtering after the fact from the resulting CSV file is difficult: even over a very short period of time I get files of several GB (many text editors reject those files due to their size).
Is there a way to get the Device and Points tabs working, so that the data can be pre-filtered before launching the export?

In addition: is there a way to get the Grafana OH Data Download plugin working with CALC data that is not stored at a fixed frame rate?

Thank you very much for your attention and for any help you can provide with this request,

Best Regards

Stephane.

Have you tried GEPDataExtractor.exe? You can find it in the openHistorian installation folder. Many people prefer this tool for the finer control it provides over data extractions. It is at least something you can try.

If this also does not suit your needs, we can try a more programmatic approach using a .NET and/or Python script.

Thanks,
Ritchie

Hello Ritchie,
Thank you very much for your quick reply.
GEP Data Extractor is exactly what I was looking for. I had overlooked it because its name mentions ‘GEP’ and I’m using STTP for the streams; I didn’t realize it could use the GEP protocol to extract data.

However, I still have a problem using the tool:
In the TIME RANGE tab, I specify the time range to extract, the data type (CALC) and the desired device (WSURESULTS, a virtual device that groups together all the data calculated by the WSU custom adapters).
In the DATA SELECTION tab, I specify that I only want to extract one specific measurement, filtered by its PointTag (FILTER ActiveMeasurements WHERE POINTAG = ‘OPDC-01.001-PROD!WSU.DAMP.MODE1.DAMPRATIO’).
Finally, in the SETTINGS tab, I specify the following values:
[image: screenshot of the SETTINGS tab values]

I then launch the CSV export and obtain an output file. The problem is that the header line of this file lists the PointTags of all the calculated values of the requested device (24978!). As expected, the columns I did not specify in the query of the ‘DATA SELECTION’ tab have no values, and the PointTag I’m looking for does. However, would it be possible for the CSV output file to contain only the timestamp and the PointTags filtered in the ‘DATA SELECTION’ tab?
A CSV file with 24978 columns (even if the great majority are empty) is rather difficult to work with in Excel, not to mention that the file is much larger than expected given the amount of data requested.

Maybe there is a switch somewhere to include in the CSV file only the data that was filtered. If not, an option to output only the ‘useful’ data would be very helpful.

Best Regards.

Stephane.

EDIT: I’ve tried with another, more standard CALC value (FILTER ActiveMeasurements WHERE Device IN (‘OPDC-1.5-PROD!CHURCHTOWN’) AND SignalType IN (‘CALC’) AND PointTag = ‘RTE_OPDC-1.5-PROD!CHURCHTOWN-IA-MVA:CALC’). The output CSV file contains all the CALC values for the specified device, not only the one I specified. I think the filter expression entered in the ‘DATA SELECTION’ tab is not correctly applied to the extraction.
The version of GEPDataExtractor.exe is 2.3.446.0.

Hello Ritchie,

Sorry for asking again about this improvement, but I’m not a developer and can’t build a custom version of GEP Data Extractor myself. Would it be possible to improve the current version of the tool so that the first line (header) of the output file keeps only the measurements specifically requested by the DATA SELECTION tab query?

This improvement would be great because my DATA SELECTION tab query targets a few specific CALC measurements, while my Measurement table contains more than a thousand ‘types’ of such CALC measurements (based on the PointTag name).

Once again, thank you very much for the support provided,

Have a nice day,

Stephane.

Hey there Stephane,

Didn’t see these messages come in, so my apologies for the long delay in responding.

I will look into adding your feature request.
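
Until such an option exists, one possible workaround is to trim the exported CSV after the fact without opening it in an editor. The following is only a minimal Python sketch: it assumes the export has a single header row, that the first column holds the timestamp, and that the other header cells match the PointTag names exactly (the real header format may differ). The function name `trim_csv` and its parameters are hypothetical, not part of any openHistorian tool.

```python
import csv


def trim_csv(src_path, dst_path, keep_tags, timestamp_column=0):
    """Stream src_path to dst_path, keeping only the timestamp column
    and the columns whose header is listed in keep_tags.

    Rows with no value in any kept measurement column are skipped.
    Returns the number of data rows written.
    """
    wanted = set(keep_tags)
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        header = next(reader)
        # Column indexes to keep: timestamp first, then the requested tags
        # (ASSUMPTION: header cells are the bare PointTag names).
        keep = [timestamp_column] + [
            i for i, name in enumerate(header)
            if i != timestamp_column and name.strip() in wanted
        ]
        missing = wanted - {header[i].strip() for i in keep}
        if missing:
            raise ValueError(f"tags not found in header: {sorted(missing)}")

        writer.writerow([header[i] for i in keep])
        rows_written = 0
        # Rows are processed one at a time, so memory use stays small
        # even when the file is several GB.
        for row in reader:
            values = [row[i] if i < len(row) else "" for i in keep]
            if any(v.strip() for v in values[1:]):
                writer.writerow(values)
                rows_written += 1
    return rows_written
```

For example, `trim_csv("export.csv", "damping.csv", ["OPDC-01.001-PROD!WSU.DAMP.MODE1.DAMPRATIO"])` would write a two-column file, where both file names are placeholders. Because the file is streamed row by row, the 24978-column export never has to fit in memory or in Excel.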

Thanks,
Ritchie

Hello Ritchie,

Could you tell me if this development is still planned, please?

Regarding the GEPDataExtractor tool, I would also like to make it available on a ‘simple’ workstation, in addition to the copy automatically installed on the server hosting openHistorian (I don’t want ‘simple’ users to connect to the server just to perform a data extraction). Using ProcMon, I have identified that the following files must be copied to this user workstation (on which openHistorian is not installed):

  • GEPDataExtractor.exe
  • GEPDataExtractor.exe.Config
  • GSF.Communication.dll
  • GSF.Core.dll
  • GSF.TimeSeries.dll
  • GSF.Windows.dll
  • Antlr3.Runtime.dll
  • ExpressionEvaluator.dll

Then I can access the openHistorian server from this workstation by opening incoming TCP port 6175 on the server (and only this port).

Could you please confirm:

  • that this list of files is exhaustive (enough for the application to work correctly in ‘standalone’ mode)? According to my tests this is indeed the case, but I prefer to make sure.
  • that I only have to open incoming port 6175 on the openHistorian server to make it work (GEPDataExtractor doesn’t need another port to be open to retrieve the data from the server once it is contacted)? I tried to verify this with the netstat command, but I’m not sure the results are exhaustive.

Thank you very much for your answer,
Wish you a nice day,

Regards.

Stephane.

Yes, this development is still in the queue - been a little busy here lately.

Your file list is probably safe; however, here’s the full list of possible required files:

  • GEPDataExtractor.exe
  • GEPDataExtractor.exe.Config
  • Antlr3.Runtime.dll
  • ExpressionEvaluator.dll
  • GSF.Communication.dll
  • GSF.Core.dll
  • GSF.Net.dll
  • GSF.Security.dll
  • GSF.ServiceProcess.dll
  • GSF.TimeSeries.dll
  • GSF.Windows.dll
  • Newtonsoft.Json.dll
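
When copying the tool to several workstations, a quick check that nothing from the list above was left behind can save some troubleshooting. This is a minimal Python sketch: the file names come from the list above, while the helper name `missing_files` and the folder argument are hypothetical.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_FILES = [
    "GEPDataExtractor.exe",
    "GEPDataExtractor.exe.Config",
    "Antlr3.Runtime.dll",
    "ExpressionEvaluator.dll",
    "GSF.Communication.dll",
    "GSF.Core.dll",
    "GSF.Net.dll",
    "GSF.Security.dll",
    "GSF.ServiceProcess.dll",
    "GSF.TimeSeries.dll",
    "GSF.Windows.dll",
    "Newtonsoft.Json.dll",
]


def missing_files(folder, required=REQUIRED_FILES):
    """Return the required file names that are absent from folder.

    Names are compared case-insensitively, as on Windows file systems.
    """
    present = {p.name.lower() for p in Path(folder).iterdir() if p.is_file()}
    return [name for name in required if name.lower() not in present]
```

Calling `missing_files` on the workstation folder that holds the copied tool returns an empty list when every file is in place.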

Port TCP 6175 should be all that is required.
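
To confirm from the workstation that the firewall rule is effective, a plain TCP connection test is usually enough. Here is a minimal Python sketch using only the standard library; the function name `tcp_port_open` is hypothetical, and a successful result only shows that the port accepts connections, not that a data subscription will succeed.

```python
import socket


def tcp_port_open(host, port=6175, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `tcp_port_open("my-historian-server")` (a placeholder host name) returning False from the workstation would point to a firewall or routing issue rather than to a missing file.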

Thanks!
Ritchie
