stringi: THE string processing package for R

Description


stringi (pronounced “stringy”, IPA [strinɡi]) is THE R package for fast, correct, consistent, and convenient string/text processing in any locale and any native character encoding. Because it is built on the ICU library, it gives R users a platform-independent set of functions familiar to Java, Perl, Python, PHP, and Ruby programmers.
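As a rough illustration of what locale-aware, encoding-safe processing looks like in practice, here is a minimal sketch. It assumes stringi is already installed; the sample strings and variable names are made up for this example.

```r
library(stringi)  # assumes stringi is already installed

# Sample input written with \u escapes so the script does not depend on
# the encoding of the source file ("zolw" with Polish diacritics, "strasse" with sharp s)
x <- c("\u017c\u00f3\u0142w", "stra\u00dfe")

# Length in Unicode code points, not bytes
n_chars <- stri_length(x)

# Locale-aware upper-casing (ICU full case mapping turns sharp s into "SS")
upper_de <- stri_trans_toupper("stra\u00dfe", locale = "de_DE")

# Pattern matching with the ICU regex engine
has_digit <- stri_detect_regex(c("abc", "a1c"), "[0-9]")

# Locale-aware sorting: in Slovak, "ch" is a separate letter that follows "h"
sorted_sk <- stri_sort(c("hladny", "chladny"), locale = "sk_SK")

print(n_chars)
print(upper_de)
print(has_digit)
print(sorted_sk)
```

The same calls should behave identically across platforms, since the locale and collation rules come from ICU rather than from the operating system.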

License: The BSD 3-Clause License (see the LICENSE file for details).

Obtaining stringi

  1. Fetch & install the latest official CRAN release by calling:
    install.packages('stringi')
    

    You are encouraged to call stri_install_check() after installation to verify that the ICU library has been installed successfully and that ICU has correctly detected your native character encoding and locale.

  2. The latest development version of stringi is available on GitHub. (Windows users need Rtools and git installed; OS X users need Xcode. Package tests are available in the devel/testthat/ directory.)
    library('devtools') # call install.packages('devtools') first
    install_github('Rexamine/stringi')
    

    Refer to the NEWS file for a complete list of changes.
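Whichever route you choose, it can help to inspect what ICU detected once the package loads. The sketch below is one possible manual check, not a replacement for stri_install_check(); it assumes that stri_info(), stri_enc_get(), and stri_locale_get() are available in the installed version of stringi.

```r
library(stringi)  # assumes one of the installation routes above succeeded

# stri_info() returns a list describing the ICU build in use
info <- stri_info()

# Native character encoding and default locale as detected by ICU
enc <- stri_enc_get()
loc <- stri_locale_get()

cat("ICU version:     ", info$ICU.version, "\n")
cat("Native encoding: ", enc, "\n")
cat("Default locale:  ", loc, "\n")
```

If the reported encoding or locale does not match your system settings, the INSTALL file is the place to look for troubleshooting hints.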

Requirements

System requirements: R ≥ 2.15 and optionally ICU4C ≥ 50.

See the INSTALL file for more details.
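The R version requirement can be checked before installing. This is a small sketch using base R only; the variable names min_r and r_ok are illustrative.

```r
# Minimum R version taken from the system requirements above
min_r <- "2.15"

# getRversion() returns a version object that compares correctly
# against a version string (e.g. "2.9" < "2.15")
r_ok <- getRversion() >= min_r

if (!r_ok) {
  stop("stringi needs R >= ", min_r, "; found ", as.character(getRversion()))
}
```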

Documentation

  1. Browse the online manual [for a fairly recent development version]
  2. Compatibility Tables – compare the functionality provided by base R, stringr, and stringi

Other Resources

  1. CRAN entry for stringi
  2. Issues tracker on GitHub
  3. Browse sources/contribute on GitHub
  4. ICU – International Components for Unicode
  5. ICU4C API documentation
  6. R project homepage

Authors

Acknowledgments

The contributions of Marcin Bujarski at the early stage of the package's development are gratefully acknowledged.

Moreover, the help of Giovanni Mazzocco in building the first OS X version of stringi is much appreciated.

The package’s API was inspired by Hadley Wickham’s stringr package. Thanks.

The authors also wish to thank their colleagues (Systems Research Institute, Polish Academy of Sciences) and students (Faculty of Mathematics and Information Science, Warsaw University of Technology) for valuable comments on and extensive testing of the package facilities.