Table 2 The average prediction time of CRF++ on single node vs Spark version

From: Recognition of bacteria named entity using conditional random fields in Spark

| Data sets (number of abstracts) | CRF++ on single node (s) | Spark version, 12 cores (s) | Spark version, 24 cores (s) | Spark version, 36 cores (s) | Spark version, 48 cores (s) |
|---|---|---|---|---|---|
| 2000 | 362.411 | 118.479 | 83.758 | 75.223 | 72.375 |
| 10,000 | 1716.569 | 533.486 | 325.471 | 286.723 | 268.614 |
| 20,000 | 3081.027 | 964.063 | 612.743 | 525.29 | 517.477 |
| 30,000 | 5207.298 | 1406.216 | 883.148 | 793.282 | 734.974 |
| 40,000 | 6141.149 | 1858.607 | 1168.061 | 1020.059 | 966.032 |
| 50,000 | 7956.735 | 2154.872 | 1465.193 | 1243.926 | 1191.362 |
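
The timings in Table 2 can be turned into speedup factors for easier comparison. The sketch below is illustrative and not part of the original article: it assumes the usual definition of speedup (single-node time divided by Spark time), and the names `TIMES`, `CORES`, and `speedups` are hypothetical. The numbers are copied from the table.

```python
# Prediction times in seconds, copied from Table 2.
# Each entry: number of abstracts -> (single-node CRF++ time, Spark times per core count).
CORES = [12, 24, 36, 48]
TIMES = {
    2000: (362.411, [118.479, 83.758, 75.223, 72.375]),
    10000: (1716.569, [533.486, 325.471, 286.723, 268.614]),
    20000: (3081.027, [964.063, 612.743, 525.29, 517.477]),
    30000: (5207.298, [1406.216, 883.148, 793.282, 734.974]),
    40000: (6141.149, [1858.607, 1168.061, 1020.059, 966.032]),
    50000: (7956.735, [2154.872, 1465.193, 1243.926, 1191.362]),
}


def speedups(single_node_time, spark_times):
    """Return single-node time divided by each Spark time (assumed speedup definition)."""
    return [single_node_time / t for t in spark_times]


# Print one line of speedup factors per data set.
for n_abstracts, (single, spark) in TIMES.items():
    factors = speedups(single, spark)
    row = ", ".join(f"{c} cores: {s:.2f}x" for c, s in zip(CORES, factors))
    print(f"{n_abstracts} abstracts: {row}")
```

Computing the ratios row by row makes it easier to see how much each additional block of 12 cores contributes for a given data set size.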