Imagine you need to check whether a string is smaller than a given byte size, say 100 bytes.
You can use the strlen function to check the string size in bytes. It returns the number of bytes, not the number of characters. So a string of 100 ASCII characters returns 100, but a string of 100 emoji such as 😀, which take 4 bytes each in UTF-8, returns 400.
If you want to count the number of characters, counting multi-byte characters (such as emoji) as single characters, you need to use the mb_strlen function instead.
An example:
// Hello in Arabic
$utf8 = "السلام علیکم ورحمة الله وبرکاته!";
echo strlen($utf8); // 59
echo mb_strlen($utf8, 'UTF-8'); // 32
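The byte-size check from the opening paragraph can be sketched as follows. The helper name `isSmallerThan` is hypothetical and only serves the illustration:

```php
<?php

// Hypothetical helper: true when $value occupies fewer than $maxBytes bytes.
function isSmallerThan(string $value, int $maxBytes): bool
{
    // strlen() counts bytes, which is exactly what a byte limit needs.
    return strlen($value) < $maxBytes;
}

$ascii = str_repeat('a', 99);  // 99 characters, 99 bytes
$emoji = str_repeat('😀', 25); // 25 characters, 100 bytes in UTF-8

var_dump(isSmallerThan($ascii, 100)); // bool(true)
var_dump(isSmallerThan($emoji, 100)); // bool(false)
```

Using mb_strlen here would report 25 for the emoji string and wrongly let it pass a 100-byte limit.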
If you happen to use the Str::length() method in Laravel, be aware that it uses the mb_strlen function under the hood.
/**
 * Return the length of the given string.
 *
 * @param string $value
 * @param string|null $encoding
 * @return int
 */
public static function length($value, $encoding = null)
{
    return mb_strlen($value, $encoding);
}
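Something similar applies to the Str::limit method in Laravel. It relies on mb_strwidth and mb_strimwidth, so it limits a string by display width (where full-width characters count as 2) rather than by bytes: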
/**
 * Limit the number of characters in a string.
 *
 * @param string $value
 * @param int $limit
 * @param string $end
 * @return string
 */
public static function limit($value, $limit = 100, $end = '...')
{
    if (mb_strwidth($value, 'UTF-8') <= $limit) {
        return $value;
    }

    return rtrim(mb_strimwidth($value, 0, $limit, '', 'UTF-8')).$end;
}
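To make the three measures concrete, here is a small comparison of bytes, characters, and display width for the same string. It assumes the mbstring extension is loaded; the sample text is just an illustration:

```php
<?php

$text = '日本語'; // three full-width characters

var_dump(strlen($text));               // int(9)  bytes in UTF-8
var_dump(mb_strlen($text, 'UTF-8'));   // int(3)  characters
var_dump(mb_strwidth($text, 'UTF-8')); // int(6)  display width
```

None of the width-based or character-based functions tells you how many bytes a string occupies, so they cannot enforce a byte limit on their own.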
So, to truncate a string to a maximum byte size using PHP, I ended up with the following function:
function truncateToMaxSize(string $inputString, int $maxSizeInMB, string $encoding = 'UTF-8') : string {
    $maxSizeInBytes = $maxSizeInMB * 1024 * 1024; // Convert MB to bytes

    if (strlen($inputString) <= $maxSizeInBytes) {
        return $inputString; // No need to truncate
    }

    $truncatedString = substr($inputString, 0, $maxSizeInBytes);

    // To ensure that you don't cut off a multi-byte character at the end
    $truncatedString = mb_substr(
        $truncatedString,
        0,
        mb_strlen($truncatedString, $encoding),
        $encoding,
    );

    return $truncatedString;
}
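As an alternative sketch, the mbstring extension also provides mb_strcut, which takes byte offsets but is documented to move the cut back to a character boundary. Whether the substr and mb_substr round trip drops an incomplete trailing sequence may depend on the PHP version, so it is worth testing with your own data. The helper name `truncateToMaxBytes` and its byte-based limit are assumptions for illustration, not part of the function above:

```php
<?php

// Hypothetical helper: truncate $value to at most $maxBytes bytes
// without splitting a multi-byte character.
function truncateToMaxBytes(string $value, int $maxBytes, string $encoding = 'UTF-8'): string
{
    if (strlen($value) <= $maxBytes) {
        return $value; // Already within the limit
    }

    // mb_strcut() works on byte offsets but never returns a partial character.
    return mb_strcut($value, 0, $maxBytes, $encoding);
}

// "é" takes 2 bytes in UTF-8, so a 2-byte limit would fall in the middle of it.
var_dump(truncateToMaxBytes('héllo', 2)); // string(1) "h"
var_dump(truncateToMaxBytes('héllo', 3)); // string(3) "hé"
```

The result can be shorter than the limit by up to one character's worth of bytes, which is the price of keeping the string valid.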
If this post was enjoyable or useful for you, please share it! If you have comments, questions, or feedback, you can reach me at my personal email. To get new posts, subscribe using the RSS feed.