apache

apache/tika

Language: Java
License: Apache-2.0
Status: active
Health: 90

The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).

Stars: 3.8k
Forks: 931
Open Issues: 60
Contributors: 931
Last Push: 0d ago

Health Breakdown

Activity: 25
Community: 25
Maintenance: 16
Popularity: 25
#content #extraction #java #metadata #tika

Should you contribute to apache/tika?

apache/tika has a FoundDev health score of 90/100, which puts it in the active-and-maintained tier. The maintainers have shipped changes recently, issues are being closed, and a PR you open this week has a realistic chance of being reviewed.

The last push was less than a day ago, which signals an actively maintained project. New issues are likely to get a maintainer response within days. The project is written primarily in Java, so prior Java experience will shorten ramp-up.

The project is licensed under Apache-2.0, a standard OSI-approved license, so contributing is generally compatible with typical employer IP policies.

More Java repos

line/line-bot-sdk-java
LINE Messaging API SDK for Java
Stars: 641, Health: 100

GoogleCloudPlatform/java-docs-samples
Java and Kotlin code samples used on cloud.google.com
Stars: 1.9k, Health: 99

apache/camel
Apache Camel is an open source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data.
Stars: 6.2k, Health: 98