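I recently saw a great presentation about Google Guava, and on our project we concluded that its caching functionality would be really interesting to use. Let's take a look at the regexp Pattern class and its compile function. In code you can often see that every time a regular expression is used, the programmer calls Pattern.compile() again with the same argument, compiling the same regular expression over and over. What can be done instead is to cache the results of such compilations. Let's take a look at the RegexpUtils utility class: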
RegexpUtils.java
package pl.grzejszczak.marcin.guava.cache.utils;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

import java.util.concurrent.ExecutionException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import static java.lang.String.format;

public final class RegexpUtils {

    private RegexpUtils() {
        throw new UnsupportedOperationException("RegexpUtils is a utility class - don't instantiate it!");
    }

    private static final LoadingCache<String, Pattern> COMPILED_PATTERNS =
            CacheBuilder.newBuilder().build(new CacheLoader<String, Pattern>() {
                @Override
                public Pattern load(String regexp) throws Exception {
                    return Pattern.compile(regexp);
                }
            });

    public static Pattern getPattern(String regexp) {
        try {
            return COMPILED_PATTERNS.get(regexp);
        } catch (ExecutionException e) {
            throw new RuntimeException(format("Error when getting a pattern [%s] from cache", regexp), e);
        }
    }

    public static boolean matches(String stringToCheck, String regexp) {
        return doGetMatcher(stringToCheck, regexp).matches();
    }

    public static Matcher getMatcher(String stringToCheck, String regexp) {
        return doGetMatcher(stringToCheck, regexp);
    }

    private static Matcher doGetMatcher(String stringToCheck, String regexp) {
        Pattern pattern = getPattern(regexp);
        return pattern.matcher(stringToCheck);
    }
}
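The same compile-once idea can also be sketched with the JDK alone, which may help to see what the cache is doing. This is only an illustration under stated assumptions, not the project's code: the class name `PatternCacheSketch` is hypothetical, and `ConcurrentHashMap.computeIfAbsent` stands in for the `LoadingCache`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.regex.Pattern;

// Hypothetical JDK-only stand-in for RegexpUtils; not part of the original project.
final class PatternCacheSketch {

    // Unbounded map: entries are never evicted, similar to a cache with no expiry settings.
    private static final ConcurrentMap<String, Pattern> COMPILED_PATTERNS = new ConcurrentHashMap<>();

    private PatternCacheSketch() {
    }

    // Compiles the regexp on the first request and returns the cached instance afterwards.
    static Pattern getPattern(String regexp) {
        return COMPILED_PATTERNS.computeIfAbsent(regexp, Pattern::compile);
    }

    static boolean matches(String stringToCheck, String regexp) {
        return getPattern(regexp).matcher(stringToCheck).matches();
    }

    public static void main(String[] args) {
        Pattern first = getPattern("something\\d?");
        Pattern second = getPattern("something\\d?");
        System.out.println(first == second);                        // true: same cached instance
        System.out.println(matches("something", "something\\d?"));  // true
        System.out.println(matches("something", "something1"));     // false
    }
}
```

A plain map offers no size limit or expiry, so a sketch like this only suits a small, fixed set of expressions; Guava's builder is the more flexible choice once eviction matters.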
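In RegexpUtils, Guava's LoadingCache, built with CacheBuilder, populates the cache with a newly compiled pattern whenever one is not found. Because compiled patterns are cached, a regular expression that has already been compiled is not compiled again (in our case never, since we have not configured any expiry). Now a simple test: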
GuavaCache.java
package pl.grzejszczak.marcin.guava.cache;

import com.google.common.base.Stopwatch;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import pl.grzejszczak.marcin.guava.cache.utils.RegexpUtils;

import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

import static java.lang.String.format;

public class GuavaCache {

    private static final Logger LOGGER = LoggerFactory.getLogger(GuavaCache.class);

    public static final String STRING_TO_MATCH = "something";

    public static void main(String[] args) {
        runTestForManualCompilationAndOneUsingCache(1);
        runTestForManualCompilationAndOneUsingCache(10);
        runTestForManualCompilationAndOneUsingCache(100);
        runTestForManualCompilationAndOneUsingCache(1000);
        runTestForManualCompilationAndOneUsingCache(10000);
        runTestForManualCompilationAndOneUsingCache(100000);
        runTestForManualCompilationAndOneUsingCache(1000000);
    }

    private static void runTestForManualCompilationAndOneUsingCache(int firstNoOfRepetitions) {
        repeatManualCompilation(firstNoOfRepetitions);
        repeatCompilationWithCache(firstNoOfRepetitions);
    }

    private static void repeatManualCompilation(int noOfRepetitions) {
        // Stopwatch.createStarted() and elapsed(TimeUnit) replace the older
        // new Stopwatch().start() and elapsedMillis(), which recent Guava versions no longer offer.
        Stopwatch stopwatch = Stopwatch.createStarted();
        compileAndMatchPatternManually(noOfRepetitions);
        LOGGER.debug(format("Time needed to compile and check regexp expression [%d] ms, no of iterations [%d]", stopwatch.elapsed(TimeUnit.MILLISECONDS), noOfRepetitions));
    }

    private static void repeatCompilationWithCache(int noOfRepetitions) {
        Stopwatch stopwatch = Stopwatch.createStarted();
        compileAndMatchPatternUsingCache(noOfRepetitions);
        LOGGER.debug(format("Time needed to compile and check regexp expression using Cache [%d] ms, no of iterations [%d]", stopwatch.elapsed(TimeUnit.MILLISECONDS), noOfRepetitions));
    }

    // Compiles each of the 10 expressions from scratch on every iteration.
    private static void compileAndMatchPatternManually(int limit) {
        for (int i = 0; i < limit; i++) {
            Pattern.compile("something").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something1").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something2").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something3").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something4").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something5").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something6").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something7").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something8").matcher(STRING_TO_MATCH).matches();
            Pattern.compile("something9").matcher(STRING_TO_MATCH).matches();
        }
    }

    // Looks up the same 10 expressions in the cache, so each is compiled only once overall.
    private static void compileAndMatchPatternUsingCache(int limit) {
        for (int i = 0; i < limit; i++) {
            RegexpUtils.matches(STRING_TO_MATCH, "something");
            RegexpUtils.matches(STRING_TO_MATCH, "something1");
            RegexpUtils.matches(STRING_TO_MATCH, "something2");
            RegexpUtils.matches(STRING_TO_MATCH, "something3");
            RegexpUtils.matches(STRING_TO_MATCH, "something4");
            RegexpUtils.matches(STRING_TO_MATCH, "something5");
            RegexpUtils.matches(STRING_TO_MATCH, "something6");
            RegexpUtils.matches(STRING_TO_MATCH, "something7");
            RegexpUtils.matches(STRING_TO_MATCH, "something8");
            RegexpUtils.matches(STRING_TO_MATCH, "something9");
        }
    }
}
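We run a series of tests and check their execution time. Note that the results are not precise: the application does not run in isolation, so many conditions can affect the execution time. We are interested in showing the scale of the problem rather than exact execution times. For a given number of iterations (1, 10, 100, 1000, 10000, 100000, 1000000), we either compile 10 regular expressions or retrieve the compiled Pattern from Guava's cache, and then match each against the string to match. These are the logs: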
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [1] ms, no of iterations [1]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [35] ms, no of iterations [1]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [1] ms, no of iterations [10]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [0] ms, no of iterations [10]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [8] ms, no of iterations [100]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [3] ms, no of iterations [100]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [10] ms, no of iterations [1000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [10] ms, no of iterations [1000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [83] ms, no of iterations [10000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [33] ms, no of iterations [10000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [800] ms, no of iterations [100000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [279] ms, no of iterations [100000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:34 Time needed to compile and check regexp expression [7562] ms, no of iterations [1000000]
pl.grzejszczak.marcin.guava.cache.GuavaCache:40 Time needed to compile and check regexp expression using Cache [3067] ms, no of iterations [1000000]
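To put the larger runs in perspective, the ratio of manual to cached time can be computed directly from the log lines above. The sketch below is only illustrative arithmetic on those logged values (the class name `SpeedupFromLogs` is hypothetical). The small runs are left out because they are dominated by noise and one-time startup cost, such as the 35 ms of the very first cached call, which presumably includes initializing the cache.

```java
import java.util.Locale;

// Hypothetical helper; the numbers are copied from the log output above.
final class SpeedupFromLogs {

    // Each row: {iterations, manual compilation ms, cached ms}.
    private static final long[][] RUNS = {
            {10_000, 83, 33},
            {100_000, 800, 279},
            {1_000_000, 7562, 3067},
    };

    private SpeedupFromLogs() {
    }

    // How many times faster the cached variant was for one run.
    static double speedup(long manualMillis, long cachedMillis) {
        return (double) manualMillis / cachedMillis;
    }

    public static void main(String[] args) {
        for (long[] run : RUNS) {
            System.out.println(String.format(Locale.ROOT, "%d iterations: %.2fx",
                    run[0], speedup(run[1], run[2])));
        }
        // prints:
        // 10000 iterations: 2.52x
        // 100000 iterations: 2.87x
        // 1000000 iterations: 2.47x
    }
}
```

In these runs the cached variant was roughly 2.5 to 2.9 times faster, which fits the caveat that the figures indicate the scale of the difference rather than exact timings.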
The sources are available under the Guava/Cache directory at https://bitbucket.org/gregorin1987/too-much-coding/src
Translated from: https://www.javacodegeeks.com/2013/04/google-guava-cache-with-regular-expression-patterns.html