I have a text like this:
81. Text1
82. Text2
82.1. Some text3
82.2. Some long text goes there in two or more lines... Some more text goes here...
83. Text4
84. Text5
It has some random spacing between the lines. I'm trying to extract every single option separately. So for example my output for 82.2. should be like this: "82.2." and "Some long text goes there in two or more lines... Some more text goes here...".
I've already tried it like this:
$exp = explode(". ", $text);
foreach($exp as $newline) {
echo explode(". ", $newline)[0];
}
But that's probably not the best idea, because sometimes there's a ". " at the end of a sentence inside the text.
CodePudding user response:
You're on the right track making use of explode:
$output = [];
$input = '81. Text1
82. Text2
82.1. Some text3
82.2. Some long text goes there in two or more lines... Some more text goes here...
83. Text4
84. Text5';
// split lines, trim any whitespace on each line and remove any that are empty
// PHP_EOL may need to be changed to how newlines are encoded in the text file
$lines = array_filter(array_map('trim', explode(PHP_EOL, $input)));
foreach ($lines as $line) {
$split = explode('. ', $line);
// The number will be the first element
$number = trim(array_shift($split));
// Join the rest back together with the same delimiter, so any ". " inside the text is preserved
$text = implode('. ', $split);
$output[] = [
'number' => $number,
'text' => $text
];
}
var_dump($output);
This yields:
array(6) {
[0]=>
array(2) {
["number"]=>
string(2) "81"
["text"]=>
string(5) "Text1"
}
[1]=>
array(2) {
["number"]=>
string(2) "82"
["text"]=>
string(5) "Text2"
}
[2]=>
array(2) {
["number"]=>
string(4) "82.1"
["text"]=>
string(10) "Some text3"
}
[3]=>
array(2) {
["number"]=>
string(4) "82.2"
["text"]=>
string(77) "Some long text goes there in two or more lines... Some more text goes here..."
}
[4]=>
array(2) {
["number"]=>
string(2) "83"
["text"]=>
string(5) "Text4"
}
[5]=>
array(2) {
["number"]=>
string(2) "84"
["text"]=>
string(5) "Text5"
}
}
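If the newline encoding of the input is not known in advance, one possible variation (a sketch, not part of the answer above) is to split on `\R`, which PCRE treats as "any newline sequence", instead of `PHP_EOL`. The helper name `splitNumberedLines` and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: split numbered lines regardless of newline style.
function splitNumberedLines(string $input): array
{
    $output = [];
    // \R matches \n, \r\n and \r; PREG_SPLIT_NO_EMPTY drops empty pieces
    $lines = preg_split('/\R/', $input, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip whitespace-only lines
        }
        // Limit of 2: only the first ". " separates the number from the text
        $parts = explode('. ', $line, 2);
        $output[] = [
            'number' => $parts[0],
            'text'   => $parts[1] ?? '',
        ];
    }
    return $output;
}

// Invented sample with Windows line endings and a blank line
$rows = splitNumberedLines("81. Text1\r\n\r\n82.2. First... Second...");
var_dump($rows);
```

Passing a limit of 2 to `explode` means the text never has to be glued back together, so periods inside the text cannot be lost.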
CodePudding user response:
You can use the limit parameter of the explode function to only get two results:
$str = <<<EOD
81. Text1
82. Text2
82.1. Some text3
82.2. Some long text goes there in two or more lines... Some more text goes here...
83. Text4
84. Text5
EOD;
foreach (explode("\n", $str) as $line) {
if (trim($line) == "") {
continue;
}
list($prefix, $text) = explode(" ", $line, 2);
echo $prefix . " -> " . $text . "\n";
}
This prints:
81. -> Text1
82. -> Text2
82.1. -> Some text3
82.2. -> Some long text goes there in two or more lines... Some more text goes here...
83. -> Text4
84. -> Text5
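The loop above assumes each line starts directly with the number and that a single space follows it. A minimal sketch of a more tolerant variant, assuming lines may be indented or use tabs, trims each line first and splits on the first run of whitespace with `preg_split`. The function name `splitPrefix` and the sample lines are hypothetical.

```php
<?php
// Hypothetical helper: returns [prefix, text], or null for blank lines.
function splitPrefix(string $line): ?array
{
    $line = trim($line);
    if ($line === '') {
        return null;
    }
    // Limit of 2: split only at the first run of whitespace
    $parts = preg_split('/\s+/', $line, 2);
    // array_pad guards against lines that contain only a prefix
    return array_pad($parts, 2, '');
}

// Invented sample lines: indented, tab-separated, blank, prefix-only
foreach (["  82.2.\tSome long text... More text...", "", "83."] as $line) {
    $pair = splitPrefix($line);
    if ($pair !== null) {
        echo $pair[0] . " -> " . $pair[1] . "\n";
    }
}
```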
CodePudding user response:
You can use a simple multiline regex to split the text and finish this in just two lines of code.
- Match all digits and period characters from the start of the line and capture them in a group: ^([\d.]+)
- Match the rest of the line in another group: (.*)$
- Use preg_match_all to match all of those lines, passing an array as the third parameter to store the matches (say $matches).
- Use array_map to merge captured groups 1 and 2.
Snippet:
<?php
preg_match_all('/^([\d.]+)(.*)$/m', $str, $matches);
$result = array_map(fn($v1, $v2) => [ $v1, $v2] , $matches[1], $matches[2]);
print_r($result);
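Note that the second group in the pattern above keeps the space that follows the number. A possible variation, sketched here with a made-up `$sample` string, consumes that whitespace outside the groups with `\h` and uses the `PREG_SET_ORDER` flag so that each match already arrives as one row, which makes `array_map` unnecessary.

```php
<?php
// Sample input invented for illustration
$sample = "81. Text1\n\n82.2. Some long text... More text...";

// ^\h*       optional leading horizontal whitespace
// ([\d.]+)   group 1: the number, e.g. "82.2."
// \h+        the gap between number and text (not captured)
// (.*)$      group 2: the rest of the line
preg_match_all('/^\h*([\d.]+)\h+(.*)$/m', $sample, $matches, PREG_SET_ORDER);

$result = [];
foreach ($matches as $m) {
    $result[] = ['number' => $m[1], 'text' => $m[2]];
}
print_r($result);
```

Blank lines never match because the first group requires at least one digit or period, so no separate filtering step is needed.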
