[PHP] Trimming multibyte strings (full-width spaces, etc.) that contain garbled characters

Aug 26, 2020 PHP trim garbled

Overview

I needed to trim full-width spaces in PHP. However, when the string to trim contained garbled characters, the methods described in other trim articles did not work properly.

First, I tried full-width trimming based on this article: https://qiita.com/fallout/items/a13cebb07015d421fde3. However, when the string contains garbled characters, as in 'machi Inagi', the garbled characters are deleted along with the spaces; for example, a garbled character at the end of "Fukui " was removed, leaving just "Fukui".
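As a rough sketch of why this happens: the pattern from that article (it appears commented out in the code further down) strips every character in the Unicode categories `\p{C}` (control, private-use, unassigned, and so on) and `\p{Z}` (separators) from both ends. I am assuming here that a garbled character is a private-use code point such as U+E000; the actual garbled characters in my data may differ.

```php
<?php
// Pattern taken from the referenced article's approach.
$pattern = '/\A[\p{C}\p{Z}]++|[\p{C}\p{Z}]++\z/u';

// Assumption: U+E000 (a private-use code point, category Co) stands in
// for a garbled character. It is not the real character from my data.
$garbled = "\u{E000}";

// A full-width space (U+3000) in front, a "garbled" character at the end.
$input = "\u{3000}Fukui" . $garbled;

// Both the full-width space (\p{Z}) and the stand-in garbled character
// (\p{C}) match the pattern, so both are removed.
$output = preg_replace($pattern, '', $input);

var_dump($output === 'Fukui'); // bool(true)
```

So the garbled character is treated the same way as whitespace, which is not what I wanted.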

I came up with my own version, but could not make it elegant

In the end, I looped over the string to remove full-width and half-width spaces from both ends, but I could not write clean code. The class name CsvUtility is not cool either.

CsvUtility.php


class CsvUtility {
    /**
     * Trims half-width and full-width spaces from both ends of a string
     * while keeping garbled characters intact (what ugly code)
     * @param string $trimString
     * @return string
     */
    public static function zenTrim(string $trimString)
    {
        // https://qiita.com/fallout/items/a13cebb07015d421fde3 ↓
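        //return preg_replace('/\A[\p{C}\p{Z}]++|[\p{C}\p{Z}]++\z/u', '', $trimString); // 'Fukui ' becomes 'Fukui' (the garbled character is deleted too)
        //return trim(mb_convert_kana($trimString, "s", 'UTF-8')); // 'Ai' becomes 'Ai' (full-width spaces inside the string are converted to half-width too)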
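        // Find the first character that is neither a half-width nor a full-width space (U+3000).
        $spos = mb_strlen($trimString); // if every character is a space, trim everything
        for ($i = 0; $i < mb_strlen($trimString); $i++) {
            $ch = mb_substr($trimString, $i, 1);
            if ($ch != ' ' && $ch != "\u{3000}") {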
                $spos = $i;
                break;
            }
        }
        if ($spos >0) {
            $trimString = mb_substr($trimString, $spos);
        }

        $epos = 0;
        for ($i = 0; $i <mb_strlen($trimString); $i++) {
            $ch = mb_substr($trimString, mb_strlen($trimString)-$i-1, 1);
            if ($ch != ' ' && $ch != "\u{3000}") { // stop at the last non-space character
                $epos = $i;
                break;
            }
        }
        if ($epos >0) {
            $trimString = mb_substr($trimString, 0, mb_strlen($trimString)-$epos);
        }

        return $trimString;

    }
}
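For comparison, here is a minimal sketch of how the same idea might be written with a regular expression that lists only the two space characters, instead of the broad `\p{C}` and `\p{Z}` categories. The function name `zenTrimRegex` is hypothetical and not part of my code above, and the sketch assumes the garbled characters are still valid UTF-8 code points (U+E000 is used as a stand-in).

```php
<?php
// Hypothetical helper (not from the original code): trims only the
// half-width space (U+0020) and the full-width space (U+3000).
function zenTrimRegex(string $s): string
{
    // \A and \z anchor the match to the very start and end of the string.
    // The u flag makes PCRE treat the pattern and the subject as UTF-8.
    $result = preg_replace('/\A[ \x{3000}]+|[ \x{3000}]+\z/u', '', $s);

    // preg_replace returns null on failure (for example, when the subject
    // is not valid UTF-8); fall back to the untouched input in that case.
    return $result ?? $s;
}

// Leading and trailing spaces of both kinds are removed,
// while the full-width space inside the string is kept.
var_dump(zenTrimRegex("\u{3000} a\u{3000}b \u{3000}") === "a\u{3000}b"); // bool(true)

// Assumption: U+E000 stands in for a garbled character. It is kept.
var_dump(zenTrimRegex(" Fukui\u{E000}\u{3000}") === "Fukui\u{E000}"); // bool(true)
```

Note that if the garbled bytes are not valid UTF-8 at all, `preg_replace` with the `u` flag fails and this sketch returns the input unchanged, so the loop version may still be the safer choice for such data.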


CsvUtilityTest.php

class CsvUtilityTest extends \Tests\TestCase
{
    /** @test */
    public function fullWidthTrimOperationCheck()
    {
        $this->assertSame(CsvUtility::zenTrim(' Kitakyushu Makoto'),'Kitakyushu Makoto');
        $this->assertSame(CsvUtility::zenTrim('Takahashi Mirai Rainbow'),'Takahashi Mirai Rainbow');
        $this->assertSame(CsvUtility::zenTrim('Harry S. Truman'),'Harry S. Truman');
        $this->assertSame(CsvUtility::zenTrim(' a'),'a');
        $this->assertSame(CsvUtility::zenTrim('a'),'a');
        $this->assertSame(CsvUtility::zenTrim('Ai'),'Ai');
        $this->assertSame(CsvUtility::zenTrim('Ai'),'Ai');
        $this->assertSame(CsvUtility::zenTrim('Honda Hiroaki'),'Honda Hiroaki');
        $this->assertSame(CsvUtility::zenTrim('Harry S. Truman'),'Harry S. Truman');
        $this->assertSame(CsvUtility::zenTrim(' Harry S. Truman'),'Harry S. Truman');
        $this->assertSame(CsvUtility::zenTrim('Tokyo Toko'),'Tokyo Toko');
        $this->assertSame(CsvUtility::zenTrim('Osaka'),'Osaka');
        $this->assertSame(CsvUtility::zenTrim(' machi Inagi'),'machi Inagi');
        $this->assertSame(CsvUtility::zenTrim('Rinda Geneviv'),'Rinda Geneviv');
    }
}

It did turn into a green build, though

Testing started at 6:54 …
php ./vendor/phpunit/phpunit/phpunit --configuration ./phpunit.xml --filter "/(Tests\Service\csv\CsvUtilityTest::fullWidthTrimOperationCheck)( .*)?$/" --test-suffix CsvUtilityTest.php ./tests/Service/csv --teamcity

PHPUnit 8.5.8 by Sebastian Bergmann and contributors.

Time: 490 ms, Memory: 24.00 MB

OK (1 test, 14 assertions)

Process finished with exit code 0

I’m not afraid of anything.