[PHP] Trimming multi-byte strings (full-width spaces, etc.) that contain garbled characters
Overview
I needed to trim full-width spaces in PHP. However, when the string to be trimmed contained garbled characters, the methods described in other articles on trimming did not work properly.
At first I tried full-width trimming based on this article: https://qiita.com/fallout/items/a13cebb07015d421fde3 However, when the string contained garbled characters, as in 'machi Inagi', the garbled characters were removed along with the spaces. For example, a garbled 'Fukui ' was cut down to 'Fukui'.
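One likely reason, offered here as an assumption rather than something the referenced article states: the `\p{C}` class in that pattern matches control, unassigned, and private-use code points, so a garbled character that falls into one of those categories is stripped along with the spaces. The sketch below uses U+E000 (a private-use code point) as a hypothetical stand-in for a garbled character. It also shows that input which is not valid UTF-8 makes `preg_replace()` with the `/u` modifier return null.

```php
<?php
// Pattern from the referenced article: strips \p{C} (other) and \p{Z} (separator)
// characters from both ends of the string.
$pattern = '/\A[\p{C}\p{Z}]++|[\p{C}\p{Z}]++\z/u';

// Hypothetical input: "Fukui" + a private-use code point + a full-width space.
$garbled = "Fukui\u{E000}\u{3000}";
$result = preg_replace($pattern, '', $garbled);

// The stand-in character is removed together with the trailing space.
var_dump($result === 'Fukui');

// Hypothetical input containing a byte (0xFF) that can never appear in valid UTF-8.
$invalid = "Fukui\xFF\u{3000}";

// With the /u modifier the subject must be valid UTF-8, otherwise the call fails.
var_dump(preg_replace($pattern, '', $invalid) === null);
```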
I worked out my own solution, but could not make it elegant
In the end, I used loops to remove full-width and half-width spaces, but I could not write clean code. The class name CsvUtility is not great either.
CsvUtility.php
class CsvUtility {
    /**
     * Trims half-width and full-width spaces from both ends of a string,
     * even when it contains garbled characters (what ugly code).
     * @param string $trimString
     * @return string
     */
    public static function zenTrim(string $trimString)
    {
        // https://qiita.com/fallout/items/a13cebb07015d421fde3 ↓
        //return preg_replace('/\A[\p{C}\p{Z}]++|[\p{C}\p{Z}]++\z/u', '', $trimString); // 'Fukui ' becomes 'Fukui'
        //return trim(mb_convert_kana($trimString, "s", 'UTF-8')); // 'Ai' becomes 'Ai'

        // Find the first character that is neither a half-width nor a full-width space.
        // If every character is a space, $spos stays at the length and the result is ''.
        $len = mb_strlen($trimString);
        $spos = $len;
        for ($i = 0; $i < $len; $i++) {
            $ch = mb_substr($trimString, $i, 1);
            if ($ch !== ' ' && $ch !== "\u{3000}") {
                $spos = $i;
                break;
            }
        }
        if ($spos > 0) {
            $trimString = mb_substr($trimString, $spos);
        }

        // Count the trailing spaces by scanning backwards from the end.
        $len = mb_strlen($trimString);
        $epos = 0;
        for ($i = 0; $i < $len; $i++) {
            $ch = mb_substr($trimString, $len - $i - 1, 1);
            if ($ch !== ' ' && $ch !== "\u{3000}") {
                $epos = $i;
                break;
            }
        }
        if ($epos > 0) {
            $trimString = mb_substr($trimString, 0, $len - $epos);
        }
        return $trimString;
    }
}
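For comparison, here is a shorter sketch of the same idea, not the code used in this article. It assumes the input is UTF-8, where the full-width space U+3000 is the byte sequence E3 80 80, and it matches on bytes without the `/u` modifier so that bytes which are not valid UTF-8 do not make the call fail. The function name `zenTrimBytes` is hypothetical.

```php
<?php
/**
 * Hypothetical alternative to CsvUtility::zenTrim(); assumes UTF-8 input.
 * Strips only U+0020 and U+3000 (bytes E3 80 80) from both ends and leaves
 * every other byte, garbled or not, untouched.
 */
function zenTrimBytes(string $s): string
{
    // One half-width space or one full-width space, written as raw bytes.
    $space = '(?:\x20|\xE3\x80\x80)';

    // No /u modifier: the subject is matched byte by byte, without UTF-8 validation.
    $result = preg_replace('/\A' . $space . '+|' . $space . '+\z/', '', $s);

    // preg_replace() returns null on failure; fall back to the unchanged input.
    return $result ?? $s;
}

// U+E000 (private use) is a hypothetical stand-in for a garbled character.
var_dump(zenTrimBytes(" \u{3000}Fukui\u{E000}\u{3000}") === "Fukui\u{E000}");

// A byte that is not valid UTF-8 is preserved as well.
var_dump(zenTrimBytes("\u{3000}Fukui\xFF ") === "Fukui\xFF");
```

Matching on bytes is safe at the end of a valid UTF-8 string because E3 is always a lead byte, so a trailing E3 80 80 can only be U+3000. For input in another encoding such as Shift_JIS, the byte sequence would need to be changed.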
CsvUtilityTest.php
class CsvUtilityTest extends \Tests\TestCase
{
    /** @test */
    public function fullWidthTrimOperationCheck()
    {
        $this->assertSame(CsvUtility::zenTrim(' Kitakyushu Makoto'), 'Kitakyushu Makoto');
        $this->assertSame(CsvUtility::zenTrim('Takahashi Mirai Rainbow'), 'Takahashi Mirai Rainbow');
        $this->assertSame(CsvUtility::zenTrim('Harry S. Truman'), 'Harry S. Truman');
        $this->assertSame(CsvUtility::zenTrim(' a'), 'a');
        $this->assertSame(CsvUtility::zenTrim('a'), 'a');
        $this->assertSame(CsvUtility::zenTrim('Ai'), 'Ai');
        $this->assertSame(CsvUtility::zenTrim('Ai'), 'Ai');
        $this->assertSame(CsvUtility::zenTrim('Honda Hiroaki'), 'Honda Hiroaki');
        $this->assertSame(CsvUtility::zenTrim('Harry S. Truman'), 'Harry S. Truman');
        $this->assertSame(CsvUtility::zenTrim(' Harry S. Truman'), 'Harry S. Truman');
        $this->assertSame(CsvUtility::zenTrim('Tokyo Toko'), 'Tokyo Toko');
        $this->assertSame(CsvUtility::zenTrim('Osaka'), 'Osaka');
        $this->assertSame(CsvUtility::zenTrim(' machi Inagi'), 'machi Inagi');
        $this->assertSame(CsvUtility::zenTrim('Rinda Geneviv'), 'Rinda Geneviv');
    }
}
It did turn into a green build, though
Testing started at 6:54 … php ./vendor/phpunit/phpunit/phpunit --configuration ./phpunit.xml --filter "/(Tests\Service\csv\CsvUtilityTest::fullWidthTrimOperationCheck) (.*)?$/" --test-suffix CsvUtilityTest.php ./tests/Service/csv --teamcity
PHPUnit 8.5.8 by Sebastian Bergmann and contributors.
Time: 490 ms, Memory: 24.00 MB
OK (1 test, 14 assertions)
Process finished with exit code 0
I’m not afraid of anything.