OT: Python help

the_msp

I have a text file of machine data that I want to parse to extract the PCB ID and the date/time.

The string I'm interested in is:

DataFile contents: Process= Servo Press PCBID= **************** TestTime= 22/12/2017 10:36:52 TestCell

I'm interested in the first 3 digits of the PCB ID and the TestTime.

Code:
string = "DataFile contents: Process= Servo Press*TestCell"
file = open(filepath)
for line in file:
    if line.find(string) != -1:
        pos1 = string.find(string) + 46
        pos2 = string.find(string) + 49
        pos3 = string.find(string) - 19
        pos4 = string.find(string) - 1
        PCB = string[pos1:pos2]
        DT = string[pos3:pos4]

This is my current approach, but the PCB ID and datetime naturally change with each part. How can I use a wildcard so the match starts at DataFile and ends at TestCell, letting me count from the start and end points to extract the data I want?

Or is there a better way?
 
Hmm, my first thought is to use split, but that assumes every line in the file has exactly that format. My "test.txt" contains the exact line you quoted above:

Code:
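with open('test.txt') as f:
    for line in f:
        # Splitting on '=' leaves the PCB ID in data[2] and the date/time in data[3]
        data = line.split('=')
        pcbid = data[2].split(' ')[1][:3]  # first 3 characters of the PCB ID
        parts = data[3].split(' ')         # ['', date, time, 'TestCell...']
        timestamp = parts[1] + ' ' + parts[2]
        print('%s %s' % (pcbid, timestamp))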

I wouldn't mind looking at an example text file if you want to PM one to me.
 
The best way I can think of to do this is with a regular expression, but if you're not familiar with them there's a bit of a learning curve.

Here's a quick and dirty example that will pull the date and time out of your example string (written for python 2):

Code:
import re

s = "DataFile contents: Process= Servo Press PCBID= **************** TestTime= 22/12/2017 10:36:52 TestCell "

# Match digits/digits/digits followed by digits:digits:digits anywhere in the string
date = re.search(r"[0-9]+/[0-9]+/[0-9]+ [0-9]+:[0-9]+:[0-9]+", s)

if date:
    print('Found a date: %s' % date.group(0))
else:
    print('not found')
A similar pattern can be used to find the PCBID.
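
As a rough sketch of that similar pattern, a single regular expression could capture both fields at once. It assumes the PCBID value contains no spaces and that TestTime is always DD/MM/YYYY HH:MM:SS. The function name `extract_fields` and the sample ID `123ABC4567890000` are made up for illustration, since the real ID is masked in the original post. Written for Python 3:

```python
import re
from datetime import datetime

# Assumed layout: "... PCBID= <id> TestTime= <dd/mm/yyyy> <hh:mm:ss> TestCell"
LINE_RE = re.compile(
    r"PCBID=\s*(?P<pcbid>\S+)\s+"
    r"TestTime=\s*(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})"
)

def extract_fields(line):
    """Return (first 3 chars of PCBID, datetime), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    stamp = datetime.strptime(m.group("stamp"), "%d/%m/%Y %H:%M:%S")
    return m.group("pcbid")[:3], stamp

# Hypothetical sample line; the ID is invented for the example.
sample = ("DataFile contents: Process= Servo Press PCBID= 123ABC4567890000 "
          "TestTime= 22/12/2017 10:36:52 TestCell")
print(extract_fields(sample))
```

Because the pattern anchors on the `PCBID=` and `TestTime=` labels rather than on character offsets, it should keep working if the process name changes length. Lines that don't match simply return None and can be skipped.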
 
Thanks, I'll give both of these a try tomorrow morning. After posting this I was actually following a regular expression tutorial site, but I've never used them before, so I didn't get it sorted in the 20 minutes I had left.
 
If you are able to, I would change the format of the data file to give yourself unique delimiters between fields. Right now the fields appear to be separated by spaces, but spaces can also appear inside a value. If a process name has only one word instead of two like "Servo Press", a split based on the number of spaces will be off for everything after the process. A unique delimiter lets you split the keys and values correctly, and you can then look up fields by name instead of raw position, which in my opinion makes errors less likely.
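
To illustrate the idea, here is a minimal sketch assuming the file could be changed to separate fields with a semicolon, a character that never appears in a value. The line format, the `parse_record` helper, and the sample ID are all hypothetical, not the machine's actual output. Written for Python 3:

```python
def parse_record(line, field_sep=";", kv_sep="="):
    """Split a 'key=value;key=value' line into a dict keyed by field name."""
    record = {}
    for field in line.strip().split(field_sep):
        if kv_sep not in field:
            continue  # skip empty or malformed fields
        key, value = field.split(kv_sep, 1)
        record[key.strip()] = value.strip()
    return record

# Hypothetical reformatted line; the process name may be one word or several.
sample = "Process=Servo Press;PCBID=123ABC4567890000;TestTime=22/12/2017 10:36:52"
record = parse_record(sample)
print(record["PCBID"][:3], record["TestTime"])
```

Looking fields up by name means a one-word process such as "Press" no longer shifts the position of everything after it.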
 
