Introduction to WEKA

Maya Cendana
Nov 28, 2022


WEKA (Waikato Environment for Knowledge Analysis) implements machine learning algorithms that can be easily applied to a dataset. At the time of writing, the latest stable version is Weka 3.8, and the development version is Weka 3.9. Both versions can be downloaded from the official WEKA website.

Figure 1. WEKA main windows

References

Some useful references are shown in Figure 2.

Figure 2. WEKA references

Public Dataset: UCI ML Repository

WEKA already ships with sample datasets in its <data> folder. Public repositories such as the UCI ML Repository, available in both its old and new (beta) versions, are another source of datasets.

Figure 3. UCI ML repository

The Package Management System

Besides the default algorithms already available in <the explorer>, WEKA also has a “package manager” feature, which can be used to search for and install additional algorithms.

Figure 4. The package management system

Simple Example

The first step is to choose <open file> in <the explorer>, as shown in Figure 5.

Figure 5. The Explorer

The second step is to prepare the data. For example, choose <weather.numeric> from the <data> folder. WEKA uses .arff as its default data format but can also read .csv files.
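To make the .arff format more concrete, here is a minimal Python sketch that parses a small ARFF text by hand. The excerpt is only modeled on the weather data (the file bundled with WEKA may differ in details), and `parse_arff` is a hypothetical helper, not part of WEKA. It covers only the simple case: no quoted values, no sparse rows, and no missing values.

```python
# Illustrative ARFF excerpt modeled on the weather data (assumption:
# the file shipped with WEKA may differ in details).
ARFF_TEXT = """\
@relation weather

@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,85,85,FALSE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
"""

NUMERIC_TYPES = ("numeric", "real", "integer")


def parse_arff(text):
    """Parse a simple ARFF text (hypothetical helper, not a WEKA API).

    Returns (relation, attributes, rows). Each attribute is a
    (name, kind) pair, where kind is a lowercase type name such as
    "numeric" or a list of the allowed nominal values.
    """
    relation = None
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                is_numeric = isinstance(kind, str) and kind in NUMERIC_TYPES
                row[name] = float(value) if is_numeric else value
            rows.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            if kind.startswith("{"):
                # Nominal attribute: allowed values are listed in braces.
                kind = [v.strip() for v in kind.strip("{}").split(",")]
            else:
                kind = kind.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
print(relation, len(attributes), len(rows))
print(rows[0])
```

The header declares each attribute and its type, and the rows after `@data` are plain comma-separated values, which is why a .csv file with a header row maps onto this format quite naturally.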

Figure 7. Preparing the data

After the weather data is loaded into <the explorer>, a visualization of the dataset is displayed, as shown in Figure 8.

Figure 8. Loading weather data into the explorer

WEKA also provides an unbalanced dataset among its sample datasets, as seen in Figure 9.

Figure 9. Example of an unbalanced dataset
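One quick way to check whether a dataset is unbalanced is to count the instances per class. The Python sketch below illustrates the idea; the labels are made-up placeholders rather than the contents of the WEKA sample file, and `class_distribution` is a hypothetical helper.

```python
from collections import Counter


def class_distribution(labels):
    """Return {class: (count, share)} for a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: (n, n / total) for cls, n in counts.items()}


# Made-up labels for illustration only: 18 negatives, 2 positives.
labels = ["negative"] * 18 + ["positive"] * 2

dist = class_distribution(labels)
for cls, (n, share) in dist.items():
    print(f"{cls}: {n} instances ({share:.0%})")

# Ratio of the largest class to the smallest class; values far above 1
# suggest an unbalanced dataset.
sizes = [n for n, _ in dist.values()]
imbalance_ratio = max(sizes) / min(sizes)
print(f"imbalance ratio: {imbalance_ratio:.1f}")
```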

The third step is to build the model. For example, the CART (classification and regression tree) algorithm can be used to classify the weather dataset. Figure 10 shows the classifier output, including the accuracy and the confusion matrix.
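Both quantities in the classifier output can be reproduced by hand. The following Python sketch builds a confusion matrix and computes the accuracy from lists of actual and predicted labels. The labels are hypothetical and are not the values from Figure 10; the function names are illustrative as well.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {cls: i for i, cls in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels for illustration only.
classes = ["yes", "no"]
actual = ["yes", "yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "yes", "no", "yes", "yes", "no"]

cm = confusion_matrix(actual, predicted, classes)
for cls, row in zip(classes, cm):
    print(cls, row)
print(f"accuracy: {accuracy(cm):.2%}")
```

The diagonal of the matrix holds the correctly classified instances, so the accuracy is simply the diagonal sum divided by the total number of instances.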

Figure 10. CART algorithm

Another way to build a decision tree is the J48 algorithm, which is WEKA's implementation of the C4.5 algorithm. As shown in Figure 11, J48 reaches an accuracy of 64.29%, which is better than that of the CART algorithm.
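The 64.29% figure can be read as a fraction of correctly classified instances. As a small worked example in Python, assuming the weather data contains 14 instances (an assumption not stated in this article), the sketch below finds how many correct predictions produce that percentage.

```python
# Assumption: the weather dataset contains 14 instances.
total_instances = 14
reported = "64.29%"

# Try every possible number of correct predictions and keep the ones
# whose accuracy, formatted to two decimals, matches the reported value.
matches = [
    correct
    for correct in range(total_instances + 1)
    if f"{correct / total_instances:.2%}" == reported
]
print(matches)

accuracy = matches[0] / total_instances
print(f"{matches[0]} of {total_instances} correct = {accuracy:.2%}")
```

Under that assumption, the figure corresponds to 9 of 14 instances being classified correctly. With so few instances, a single additional correct prediction shifts the accuracy by about 7 percentage points, so small differences between algorithms should be read with care.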

Figure 11. J48 algorithm

Thank you. Hopefully, this article provides a simple explanation of how to get started with WEKA.
