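I have a PHP file, signup.php, that saves the contents of a form (defined in form.php) to a MySQL database. The problem appears when I try to reformat the input: I want to transliterate accented UTF-8 characters, e.g. à -> a.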

    $first_name = $_POST['first_name'];
    $last_name  = $_POST['last_name'];
    $course     = $_POST['course'];

    $chain = "prêt-à-porter";

    // One pattern per accented character ("'" is used as the regex delimiter)
    $pattern = array(
        "'é'", "'è'", "'ë'", "'ê'", "'É'", "'È'", "'Ë'", "'Ê'",
        "'á'", "'à'", "'ä'", "'â'", "'å'", "'Á'", "'À'", "'Ä'", "'Â'", "'Å'",
        "'ó'", "'ò'", "'ö'", "'ô'", "'Ó'", "'Ò'", "'Ö'", "'Ô'",
        "'í'", "'ì'", "'ï'", "'î'", "'Í'", "'Ì'", "'Ï'", "'Î'",
        "'ú'", "'ù'", "'ü'", "'û'", "'Ú'", "'Ù'", "'Ü'", "'Û'",
        "'ý'", "'ÿ'", "'Ý'", "'ø'", "'Ø'", "'œ'", "'Œ'", "'Æ'", "'ç'", "'Ç'"
    );

    // Replacements, in the same order as $pattern
    $replace = array(
        'e', 'e', 'e', 'e', 'E', 'E', 'E', 'E',
        'a', 'a', 'a', 'a', 'a', 'A', 'A', 'A', 'A', 'A',
        'o', 'o', 'o', 'o', 'O', 'O', 'O', 'O',
        'i', 'i', 'i', 'i', 'I', 'I', 'I', 'I',
        'u', 'u', 'u', 'u', 'U', 'U', 'U', 'U',
        'y', 'y', 'Y', 'o', 'O', 'oe', 'OE', 'AE', 'c', 'C'
    );

    $chain = preg_replace($pattern, $replace, $chain);
    echo $chain; // prints pret-a-porter

    $first_name = preg_replace($pattern, $replace, $first_name);
    echo $first_name; // does not change the input!?!

Why does this work perfectly for $chain but not for $first_name or $last_name?
I also tried:

    echo $first_name; // prints áááááábéééééébšššš

    $trans = array("á" => "a", "é" => "e", "š" => "s");
    echo strtr("áááááábéééééébšššš", $trans); // prints aaaaaabeeeeeebssss
    echo strtr($first_name, $trans);          // prints áááááábéééééébšššš

But as you can see, the problem is the same!
There is an easier way, using iconv. Judging from the user notes, it does what you seem to want: character transliteration.

    <?php
    // From the PHP.net user notes
    $string = "ʿABBĀSĀBĀD";

    echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $string);
    // output: [nothing, and you get a notice]

    echo iconv('UTF-8', 'ISO-8859-1//IGNORE', $string);
    // output: ABBSBD

    echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT//IGNORE', $string);
    // output: ABBASABAD
    // Yay! That's what I wanted!
    ?>

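Be very careful with character encodings: the same encoding has to be used at every stage of the process (front end, form submission, encoding of the source files). If the literal in your source file is stored in one encoding and the form submits its values in another, the same replacement will work on the literal and silently do nothing on the $_POST values, which is consistent with what you are seeing. Before PHP 5.4, functions such as htmlspecialchars() assumed ISO-8859-1 by default; in PHP 5.4 that default changed to UTF-8 (finally!).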
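To make the encoding point concrete, here is a minimal sketch of normalising a submitted value to UTF-8 before any replacement runs. The helper name `to_utf8` and the ISO-8859-1 fallback are assumptions for illustration, not something taken from your code, and the sketch assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper: return $value as UTF-8, assuming that anything which
// is not already valid UTF-8 was sent in $fallback (ISO-8859-1 by default).
function to_utf8(string $value, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($value, 'UTF-8', $fallback);
}

// The same text as raw bytes in both encodings
$latin1 = "pr\xEAt-\xE0-porter";         // ISO-8859-1 bytes
$utf8   = "pr\xC3\xAAt-\xC3\xA0-porter"; // UTF-8 bytes

var_dump(to_utf8($latin1) === $utf8); // bool(true)
var_dump(to_utf8($utf8) === $utf8);   // bool(true)
```

Comparing `bin2hex($first_name)` with `bin2hex("á")` from your source file is a quick way to check whether the two sides really use the same bytes.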
There are a couple of functions you can look at for ideas. The first is the slug method from CakePHP's Inflector class:

    public static function slug($string, $replacement = '_') {
        $quotedReplacement = preg_quote($replacement, '/');

        $merge = array(
            // anything that is not whitespace, a letter or a digit becomes a space
            '/[^\s\p{Ll}\p{Lm}\p{Lo}\p{Lt}\p{Lu}\p{Nd}]/mu' => ' ',
            // runs of whitespace become the replacement character
            '/\\s+/' => $replacement,
            // strip leading and trailing replacement characters
            sprintf('/^[%s]+|[%s]+$/', $quotedReplacement, $quotedReplacement) => '',
        );

        $map = self::$_transliteration + $merge;
        return preg_replace(array_keys($map), array_values($map), $string);
    }

It relies on a self::$_transliteration array, which is similar to what you did in your question. You can look at the Inflector source on GitHub.
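As a rough idea of how such a table works, here is an abbreviated sketch. The map below is hypothetical and much shorter than the real Inflector table, and the function name `transliterate` is made up for illustration. It assumes the source file is saved as UTF-8 so that the literal characters in the patterns are UTF-8 too.

```php
<?php
// Hypothetical, abbreviated pattern => replacement map.
// The /u modifier makes PCRE treat pattern and subject as UTF-8.
$transliteration = array(
    '/[àáâãå]/u' => 'a',
    '/[èéêë]/u'  => 'e',
    '/[ìíîï]/u'  => 'i',
    '/[òóôõø]/u' => 'o',
    '/[ùúûü]/u'  => 'u',
    '/ç/u'       => 'c',
    '/œ/u'       => 'oe',
);

function transliterate(string $text, array $map): string
{
    $result = preg_replace(array_keys($map), array_values($map), $text);
    // preg_replace() returns null on error, for example when the subject is
    // not valid UTF-8 under /u; fall back to the untouched input in that case.
    return $result ?? $text;
}

echo transliterate("prêt-à-porter", $transliteration); // prints pret-a-porter
```

With the /u modifier, a subject that is not valid UTF-8 makes preg_replace() fail rather than silently skip the characters, and preg_last_error() can then be checked to spot an encoding mismatch.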
The other is a function I use myself, which comes from here.

    function slugify($text, $strict = false) {
        $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');

        // replace non letter or digits by -
        $text = preg_replace('~[^\\pL\d.]+~u', '-', $text);

        // trim
        $text = trim($text, '-');
        setlocale(LC_CTYPE, 'en_GB.utf8');

        // transliterate
        if (function_exists('iconv')) {
            $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
        }

        // lowercase
        $text = strtolower($text);

        // remove unwanted characters
        $text = preg_replace('~[^-\w.]+~', '', $text);

        if (empty($text)) {
            return 'empty_$';
        }
        if ($strict) {
            $text = str_replace(".", "_", $text);
        }
        return $text;
    }

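What these functions do is transliterate and create a "slug" from arbitrary text input, which is a very, very useful thing to have in your toolbox when building web apps. Hope this helps!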