In Python 3, it is quite easy to extract text from named groups as shown in the following example:
import re
myStr = r"4000/2000/5000/7000"
reObj = re.compile(r"""(?P<g1>\d+) # a capturing group named g1
/
(?P<g2>\d+)
/
(?P<g3>\d+)
/
(?P<g4>\d+)""", re.VERBOSE)
matchObj = reObj.match(myStr) # match the string to be searched
print(matchObj.group("g1")) # 4000
print(matchObj.group("g2")) # 2000
print(matchObj.group("g3")) # 5000
print(matchObj.group("g4")) # 7000
However, in LibreOffice Calc I simply couldn't get any clue (Calc does not even have a independent regex() function which gives a regex pattern). Position-based workarounds as in this post isn't what I need.
PLEASE GIVE ANSWERS INDEPENDENT OF POSITIONAL PARAMETERS and PLEASE EXEMPLIFY EXPLICITLY. e.g. MID() is not acceptable. Although the example given here is simple enough, but I need a general way to deal with a real situation that is far more complex.
Best Answer
In both Excel and Calc, the cleanest solution is to create a general-purpose regular expression macro. To do this in Calc, go to
Tools -> Macros -> Organize Macros -> LibreOffice Basic
and add the following code to Module1:Now you can use
ReFind
for any regular expressions needed in the spreadsheet. For example, in cell A1 enter12345
. In cell B1, enter the formula=REFIND($A$1,"(\d\d)(\d)",2)
. This will retrieve the third number, which is 3.The code was adapted from https://forum.openoffice.org/en/forum/viewtopic.php?t=30502.
Note: It would be better to create the regular expression in python or java using an add-in. However that requires XML declaration files in an extension, which takes more time to set up.