Public Member Functions
__construct ($charsetStreamAnalyzer=null)

analyze ($stringToAnalyze, $maxLengthAnalyzed=self::DEFAULT_MAX_ANALYZED_LENGTH)
Analyze a given piece of text and try to detect its charset.
convertIfRelevant ($stringToConvert)
Convert the given text to the target charset if the detected charset is relevant enough.
getCharsetStreamAnalyzer ()

getMinRelevance ()
See the documentation of the CharsetDetector::setMinRelevance method.
getTargetCharset ()

setMinRelevance ($minRelevance)
Set the minimal relevance. Relevance is normalized to the interval [-1, 1]: 1 means the charset was detected with certainty, -1 means it was ruled out with certainty, and 0 means unknown.
setTargetCharset ($targetCharset)
Set the target charset.
Static Public Member Functions
static convert ($stringToConvert, $targetCharset=self::DEFAULT_TARGET_CHARSET)
Convert a string to the desired charset if the detected charset is relevant enough.
static file_get_contents ($filename)
Intelligent "equivalent" of the standard PHP file_get_contents function.
Public Attributes
const DEFAULT_MAX_ANALYZED_LENGTH = 65536
Default value for the maxLengthAnalyzed parameter of the analyze method.
const DEFAULT_MIN_RELEVANCE = 0.1
Default value for the minRelevance property.
const DEFAULT_TARGET_CHARSET = "UTF-8"
Default target charset. Can be overridden on the fly.
This class is an intelligent wrapper around the iconv library. It can guess the charset of a given text using static analysis. The default behavior is highly extensible: you can load configuration for different charsets or different languages, and you can do so dynamically.
Definition at line 27 of file CharsetDetector.class.php.
CharsetDetector::__construct ($charsetStreamAnalyzer = null)
charsetStreamAnalyzer | Initialized instance of the analyzer.
Use the CharsetStreamAnalyzerFactory class to create such an object, or omit this parameter to use the default analyzer.
Definition at line 58 of file CharsetDetector.class.php.
References CharsetStreamAnalyzerFactory::createDefault().
CharsetDetector::analyze ($stringToAnalyze, $maxLengthAnalyzed = self::DEFAULT_MAX_ANALYZED_LENGTH)
Analyze a given piece of text and try to detect its charset.
stringToAnalyze | Text to analyze.
maxLengthAnalyzed | Maximum length of text used for the analysis.
Definition at line 77 of file CharsetDetector.class.php.
static CharsetDetector::convert ($stringToConvert, $targetCharset = self::DEFAULT_TARGET_CHARSET) [static]
Convert a string to the desired charset if the detected charset is relevant enough.
If no charset is detected (or the detection is not relevant enough), it just returns the original input.
stringToConvert | String to convert (the input of this method).
targetCharset | Charset to convert to, in iconv's notation.
Definition at line 47 of file CharsetDetector.class.php.
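A minimal usage sketch for the static helper, assuming the class has already been loaded by your own bootstrap or autoloader (loading is not covered on this page). The wrapper function `toTargetCharset` is hypothetical and exists only for illustration; it passes the input through unchanged when the class is not available.

```php
<?php
// Hypothetical helper, not part of the library: wraps the documented
// static call so the snippet also runs when CharsetDetector is not loaded.
function toTargetCharset($raw, $targetCharset = 'UTF-8')
{
    if (class_exists('CharsetDetector')) {
        // Documented behavior: returns the converted string, or the original
        // input when no charset is detected with enough relevance.
        return CharsetDetector::convert($raw, $targetCharset);
    }
    // Assumption for this sketch: without the library, pass the input through.
    return $raw;
}

// "café" encoded as ISO-8859-1 (a single 0xE9 byte for the last letter).
$text = toTargetCharset("caf\xE9");
```

Because an undetected charset leaves the input untouched, callers that need a guaranteed target encoding may still want to validate the result themselves.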
CharsetDetector::convertIfRelevant ($stringToConvert)
Convert the given text to the target charset if the detected charset is relevant enough.
stringToConvert | Text to convert (the input of this method).
Definition at line 98 of file CharsetDetector.class.php.
References CharsetConverter::convertCharset().
static CharsetDetector::file_get_contents ($filename) [static]
Intelligent "equivalent" of the standard PHP file_get_contents function.
filename | Name of the file to read (and recode if needed).
Definition at line 37 of file CharsetDetector.class.php.
CharsetDetector::getMinRelevance ()
See the documentation of the CharsetDetector::setMinRelevance method.
Definition at line 128 of file CharsetDetector.class.php.
CharsetDetector::setMinRelevance ($minRelevance)
Set the minimal relevance. Relevance is normalized to the interval [-1, 1]: 1 means the charset was detected with certainty, -1 means it was ruled out with certainty, and 0 means unknown.
minRelevance | Minimal relevance to set.
Definition at line 137 of file CharsetDetector.class.php.
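The role of the threshold can be illustrated with a small stand-alone sketch. This is not the library's implementation: the function `convertIfRelevantSketch` and its parameters are hypothetical, the detected charset and its relevance are passed in by hand instead of being computed, and the comparison `relevance < minRelevance` is an assumption about how the threshold is applied. Only PHP's built-in `iconv` is used.

```php
<?php
// Illustrative stand-in, not the library's code: shows how a relevance
// threshold can gate an iconv conversion.
function convertIfRelevantSketch(
    $text,
    $detectedCharset,
    $relevance,
    $minRelevance = 0.1,      // mirrors DEFAULT_MIN_RELEVANCE
    $targetCharset = 'UTF-8'  // mirrors DEFAULT_TARGET_CHARSET
) {
    // Below the threshold the guess is not trusted; keep the input as-is.
    if ($relevance < $minRelevance) {
        return $text;
    }
    $converted = iconv($detectedCharset, $targetCharset, $text);
    // iconv returns false on failure; fall back to the original text.
    return $converted === false ? $text : $converted;
}

$latin1    = "caf\xE9"; // "café" in ISO-8859-1
$confident = convertIfRelevantSketch($latin1, 'ISO-8859-1', 0.8);  // converted
$doubtful  = convertIfRelevantSketch($latin1, 'ISO-8859-1', 0.05); // unchanged
```

Raising the threshold trades fewer wrong conversions for more inputs left in their original encoding.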
CharsetDetector::setTargetCharset ($targetCharset)
Set the target charset.
targetCharset | Target charset to set.
Definition at line 121 of file CharsetDetector.class.php.
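The instance methods documented on this page can be combined roughly as follows. This is a sketch under stated assumptions: the class is loaded elsewhere, `convertIfRelevant` returns the resulting string (as the static `convert` is documented to do), and the helper name `convertWithSettings` is hypothetical.

```php
<?php
// Hypothetical helper, not part of the library: configures a detector
// instance and converts one string with it.
function convertWithSettings($raw, $targetCharset, $minRelevance)
{
    if (!class_exists('CharsetDetector')) {
        // Assumption for this sketch: without the library, pass the input through.
        return $raw;
    }
    $detector = new CharsetDetector();           // default analyzer
    $detector->setTargetCharset($targetCharset); // iconv charset name
    $detector->setMinRelevance($minRelevance);   // value within [-1, 1]
    // Assumed to return the converted (or original) string.
    return $detector->convertIfRelevant($raw);
}

$result = convertWithSettings("caf\xE9", 'UTF-8', 0.3);
```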