Blog Archives

stringi 1.1.1 released

stringi is among the top 10 most downloaded R packages, providing various string processing facilities. A new release comes with a few bugfixes and new features.
* [BUGFIX] #214: allow a regex pattern like `.*` to match an empty string.
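The #214 fix can be checked with a one-liner. This is a minimal sketch, assuming stringi 1.1.1 or later is installed; using `stri_detect_regex` as the probe is our choice and is not taken from the release notes.

```r
library(stringi)

# Assumption: stringi >= 1.1.1, where bug #214 is fixed.
# The pattern `.*` matches zero or more characters, so it should
# also match the empty string.
empty_matches <- stri_detect_regex("", ".*")
print(empty_matches)

# A non-empty string matches the same pattern as well.
nonempty_matches <- stri_detect_regex("abc", ".*")
print(nonempty_matches)
```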

Posted in Blog/News, Blog/R

stringi 1.0-1

First of all, we’re happy to announce that, since the 1.0.0 release, the stringr package for R has been powered by stringi. For more details, read here. Also please note that the stringi package version 1.0-1 is now on

Posted in Blog/News, Blog/R

Speeding up R packages’ installation process

"There is a time for some things, and a time for all things; a time for great things, and a time for small things."
-- Miguel de Cervantes
Building R packages from sources may take a long time, especially if

Posted in Blog/R, Blog/R-bloggers

Pull the (character) strings with stringi 0.5-2

A reliable string processing toolkit is a must-have for any data scientist. A new release of the stringi package is available on CRAN (please wait a few days for Windows and OS X binary builds). As of now, about 850

Posted in Blog/News, Blog/R, Blog/R-bloggers

Using Hadoop Streaming API to perform a word count job in R and C++

by Marek Gagolewski, Maciej Bartoszuk, Anna Cena, and Jan Lasek (Rexamine).
Introduction
In a recent blog post we explained how we managed to set up a working Hadoop environment on a few CentOS7 machines. To test the installation, let’s play

Posted in Blog/Hadoop, Blog/R, Blog/R-bloggers

Installing Hadoop 2.6.0 on CentOS 7

by Marek Gagolewski, Maciej Bartoszuk, Anna Cena, and Jan Lasek (Rexamine).
Configuring a working Hadoop 2.6.0 environment on CentOS 7 is a bit of a struggle. Here are the steps we took to set everything up so that we have

Posted in Blog/Hadoop

stringi 0.4-1 released – fast, portable, consistent character string processing

A new release of the stringi package is available on CRAN (please wait a few days for Windows and OS X binary builds).
    # install.packages("stringi") or update.packages()
    library("stringi")
Here’s a list of changes in version 0.4-1. In the current release,

Posted in Blog/News, Blog/R, Blog/R-bloggers

Faster, easier, and more reliable character string processing with stringi 0.3-1

A new release of the stringi package is available on CRAN (please wait a few days for Windows and OS X binary builds).
    # install.packages("stringi") or update.packages()
    library("stringi")
stringi is an R package providing (but definitely not limited to) equivalents

Posted in Blog/R, Blog/R-bloggers

ICU Unicode text transforms in the R package stringi

The ICU (International Components for Unicode) library provides very powerful and flexible ways to apply various Unicode text transforms. These include:
* Full (language-specific) case mappings,
* Unicode normalization,
* Text transliteration (e.g. script-to-script conversion).
All of these are available to R programmers/users
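The three kinds of transforms listed above can be tried out as follows. This is a minimal sketch, assuming a recent stringi installation; the sample strings, the Turkish locale, and the `Latin-ASCII` transliterator are illustrative choices and do not come from the original post.

```r
library(stringi)

# Language-specific case mapping: in Turkish, "i" uppercases to the
# dotted capital I (U+0130), unlike in English.
upper_tr <- stri_trans_toupper("i", locale = "tr_TR")
upper_en <- stri_trans_toupper("i", locale = "en_US")

# Unicode normalization: NFC composes "a" + combining acute accent (U+0301)
# into the single code point U+00E1.
composed <- stri_trans_nfc("a\u0301")

# Transliteration through an ICU transform identifier: strip Latin
# diacritics down to plain ASCII.
ascii <- stri_trans_general("\u017c\u00f3\u0142w", "Latin-ASCII")

print(c(upper_tr, upper_en, composed, ascii))
```

`stri_trans_list()` returns the transform identifiers available in the ICU build in use, which is a handy way to find alternatives to `Latin-ASCII`.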

Posted in Blog/R, Blog/R-bloggers

Counting the number of words in a LaTeX file with stringi

In my recent post I promised to present the most interesting features of the stringi package in more detail. Here's one such jolly feature. Many LaTeX users may find it very useful.
Loading a text file with encoding auto-detection

Posted in Blog/LaTeX, Blog/R, Blog/R-bloggers