A hidden "bad base-64" bug on Android caused by replacing RandomAccessFile readUTF()/writeUTF() with FileInputStream/FileOutputStream

倒霉蛋小

Published 2026-07-01 17:06:01


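I went straight to the code. The file content is just a configuration table stored as Base64.encode(AES(raw)), and the only change between the two SDK versions is that file reads and writes moved from RandomAccessFile#readUTF() / RandomAccessFile#writeUTF() to FileInputStream#read() / FileOutputStream#write(). What could possibly go wrong with that?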

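Background: why the file read/write method was upgraded

The app originally persisted its encrypted configuration files with RandomAccessFile.writeUTF() / readUTF(). writeUTF() stores the encoded length in a 2-byte header, so once the content to be written exceeds 65,535 bytes (64 KB) in encoded form, it throws UTFDataFormatException.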
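The limit can be checked on a desktop JVM. The following is a minimal sketch, not the project's code: the class and method names are made up for illustration, and it uses DataOutputStream.writeUTF, which writes the same format as RandomAccessFile.writeUTF.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class WriteUtfLimitSketch {

    /** Returns true if writeUTF accepts a string made of asciiLength ASCII characters. */
    static boolean writeUtfAccepts(int asciiLength) {
        char[] chars = new char[asciiLength];
        Arrays.fill(chars, 'A'); // each ASCII character encodes to exactly one byte
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(new String(chars));
            return true;
        } catch (UTFDataFormatException e) {
            return false; // the encoded length does not fit into the 2-byte header
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeUtfAccepts(65535)); // true
        System.out.println(writeUtfAccepts(65536)); // false
    }
}
```

Base64 output is pure ASCII, so for this use case the character count and the encoded byte count are the same.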

To avoid this, the new version switched to FileInputStream / FileOutputStream:

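// Old version: write
public static boolean write(String fileName, String value) {
    try (RandomAccessFile accessFile = new RandomAccessFile(getFile(fileName), "rw")) {
        accessFile.setLength(0);
        // writeUTF prepends a 2-byte big-endian length header to the encoded string
        accessFile.writeUTF(Base64.encodeToString(value.getBytes(StandardCharsets.UTF_8), Base64.NO_WRAP));
        return true;
    } catch (IOException e) {
        return false;
    }
}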

// New version: write
public static boolean write(String fileName, String value) {
    try (FileOutputStream fos = new FileOutputStream(getFile(fileName))) {
        // encodeToString takes a byte[], so convert the String first
        fos.write(Base64.encodeToString(value.getBytes(StandardCharsets.UTF_8), Base64.NO_WRAP).getBytes(StandardCharsets.UTF_8));
        return true;
    } catch (IOException e) {
        return false;
    }
}

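The read logic was simplified accordingly: read all bytes, convert them to a string, and Base64-decode it before handing the result to the decryption flow: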

// New version: read
public static String readOrCreate(String fileName) {
    File file = getFile(fileName);
    try (FileInputStream fis = new FileInputStream(file)) {
        byte[] data = new byte[(int) file.length()];
        fis.read(data);
        return new String(Base64.decode(new String(data, StandardCharsets.UTF_8), Base64.NO_WRAP));
    } catch (IOException e) {
        return null;
    }
}
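One side note on the listing above: InputStream.read(byte[]) is allowed to return fewer bytes than requested, so a single call is not guaranteed to fill the buffer. This is unrelated to the bug discussed in this post, but a loop makes the read robust. The sketch below is an illustration only; readFully and trickle are hypothetical helper names, and trickle exists just to simulate short reads.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class ReadFullySketch {

    /** Reads exactly length bytes, looping because a single read() may return fewer. */
    static byte[] readFully(InputStream in, int length) {
        byte[] data = new byte[length];
        int offset = 0;
        try {
            while (offset < length) {
                int n = in.read(data, offset, length - offset);
                if (n < 0) {
                    throw new EOFException("stream ended after " + offset + " bytes");
                }
                offset += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return data;
    }

    /** Test helper: a stream that hands out at most one byte per read call. */
    static InputStream trickle(byte[] src) {
        return new ByteArrayInputStream(src) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 1));
            }
        };
    }

    public static void main(String[] args) {
        byte[] src = {1, 2, 3, 4, 5};
        System.out.println(Arrays.equals(src, readFully(trickle(src), src.length))); // true
    }
}
```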

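Testing passed. Then, during the staged rollout, a customer suddenly reported that initialization failed after an in-place (overwrite) upgrade, with a "local file read failed" error. The old version worked, and a fresh install of the new version worked; only the overwrite upgrade failed, and it failed every time. I quickly got hold of the customer's APK and started debugging.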

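Initial investigation and fix

After swapping in a dex with extra debug output on a test device, I narrowed the problem down to Base64.decode throwing while reading the file: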

java.lang.IllegalArgumentException: bad base-64
    at android.util.Base64.decode(Base64.java:...)

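Strangely, of the three files read in sequence during startup (named in reverse alphabetical order: GAMMA, BETA, ALPHA), only the last one, ALPHA, threw the exception. The first two were read and decrypted correctly.

All three files had been written by the old version, so why did only ALPHA fail?

Round one: focusing on Base64 encoding and the file format

My first suspect was the character encoding used by Base64.encodeToString(). Everywhere I call new String I pass charset=StandardCharsets.UTF_8, whereas android.util.Base64#encodeToString is implemented as: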

    public static String encodeToString(byte[] input, int flags) {
        try {
            return new String(encode(input, flags), "US-ASCII");
        } catch (UnsupportedEncodingException e) {
            // US-ASCII is guaranteed to be available.
            throw new AssertionError(e);
        }
    }

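However, the source confirms that its output is strictly limited to the US-ASCII Base64 alphabet (A-Za-z0-9+/=), so it cannot introduce illegal characters.

The only option left was to adb pull this file and compare it with a file holding the same plaintext written by the new version. The comparison showed an extra special character at the very beginning:

[Figure: diff between the files written by the two versions]

So I asked an AI, and only then realized the hazard my careless upgrade had left behind: writeUTF() prefixes the content with two bytes that hold the byte length of the following UTF-8 data in big-endian order. The first two bytes of my file were 0x03 and 0x58; 0x0358 = 856 = 858 - 2, and the file size was exactly 858 bytes. I put this back into the program to verify: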

// New version: read (verification)
public static String readOrCreate(String fileName) {
    File file = getFile(fileName);
    try (FileInputStream fis = new FileInputStream(file)) {
        byte[] data = new byte[(int) file.length()];
        
        /* append */
        int readCount = fis.read(data);
        if (readCount <= 0) {
            return null;
        }
        String cipherText;
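        // Check whether the data was written by the old version's writeUTF (the first two bytes may encode the length)
        // A simple, effective test: if the length formed by the first two bytes plus 2 equals the total file length, it is the old format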
        if (readCount >= 2) {
            int utfLen = ((data[0] & 0xFF) << 8) | (data[1] & 0xFF);
            if (utfLen + 2 == readCount) {
                // Old format: skip the first two bytes
                cipherText = new String(data, 2, utfLen, StandardCharsets.UTF_8);
            } else {
                // New format: use all bytes
                cipherText = new String(data, 0, readCount, StandardCharsets.UTF_8);
            }
        } else {
            cipherText = new String(data, 0, readCount, StandardCharsets.UTF_8);
        }
        return new String(Base64.decode(cipherText, Base64.NO_WRAP));
    } catch (IOException e) {
        return null;
    }
}

Sure enough, when the file is detected as the old format and the first two bytes are skipped, the parsed text is correct.
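The header layout described above can also be confirmed on a desktop JVM. This is a minimal sketch with made-up names: it writes an 856-character ASCII string through DataOutputStream.writeUTF (the same format as RandomAccessFile.writeUTF) and inspects the first two bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class WriteUtfHeaderSketch {

    /** Bytes that writeUTF emits for a string of asciiLength ASCII characters. */
    static byte[] writeUtfBytes(int asciiLength) {
        char[] chars = new char[asciiLength];
        Arrays.fill(chars, 'A');
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(new String(chars));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Length stored in the 2-byte big-endian header. */
    static int headerLength(byte[] fileBytes) {
        return ((fileBytes[0] & 0xFF) << 8) | (fileBytes[1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] bytes = writeUtfBytes(856);
        // prints: size=858 header=03 58 length=856
        System.out.printf("size=%d header=%02X %02X length=%d%n",
                bytes.length, bytes[0], bytes[1], headerLength(bytes));
    }
}
```

The numbers match the ALPHA file: an 856-byte payload produces the header 0x03 0x58 and a total size of 858 bytes.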

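In my actual scenario, once the old format is detected the method simply returns null, so that the caller fetches the data again and writes it back in the new format.

I immediately submitted it for testing, released it, and pushed it to the customer to resolve the issue. But... why only ALPHA, and why had earlier testing never hit this?

Know the what, and also the why

Why was only one of the three files broken, and why did only this one customer's configuration file trigger the error? The difference between writeUTF and stream.read alone does not explain that.

The battle with the AI

After many rounds of conversation, I finally got the AI to accept the facts: the file format and the write method were the same for all files, and only this one file failed.

The AI's incorrect inferences included:

All three files should fail, because they fall outside the range of legal Base64 characters
Only the alpha file was really written by writeUTF; the other files have no length header at all (they are raw byte streams)
The length header merely happens to correlate with the data and was not written by writeUTF
An ordering problem in the Base64 decoder's character-set checks
"The truth revealed: your code never Base64-decodes these files at all!"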

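By this point I had already supplied the implementation, the original files, and the observed behavior, yet the AI still firmly believed my observations were wrong, confidently offered its own "evidence", and then argued with itself:

alpha:   0x03 0x58 ... → string starts with: '\u0003' 'X' ...
beta:    0x08 0x2C ... → string starts with: '\b' ',' ...
gamma:   0x00 0x18 ... → string starts with: '\u0000' '\u0018' ...

Their high bytes are all illegal Base64 characters, so in theory every one of them should fail immediately!

The only explanation: the Base64 decoder treats control characters differently

According to the Android source, android.util.Base64.decode has two modes:
- Strict mode: throws as soon as an illegal character is encountered
- Lenient mode: skips whitespace and control characters

But! In some versions, Android's Base64.decode(String, int) automatically skips control characters with ASCII codes below 0x20!
Let's verify:
\0 (0x00)	0	✅ skipped as whitespace in some implementations
\u0001 (0x01)	1	✅ skipped as whitespace in some implementations
\u0003 (0x03)	3	❌ not skipped, triggers the error
\b (0x08)	8	❌ not skipped, but...

I have to say, DeepSeek's new expert mode (some guess it is v4-lite) is rather endearingly human-like, especially its "wait a moment". I also picked up quite a few ideas along the way: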

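ALPHA.txt (858 bytes, fails)
First bytes: 0x03 0x58 0x56 0x55 0x35 0x66 0x66...
         ↑     ↑     ↑
         length header  V     U    5    f    f

0x03 is the control character ETX
What follows is a standard Base64 string VU5ffMqb...
The Base64 decoder sees 0x03 and fails immediately ✅

BETA.txt (2094 bytes, does not fail)
First bytes: 0x08 0x2C 0x44 0x63 0x55 0x73...
               ↑     ↑
               ,     D    c    U    s

Key finding: 0x2C is a comma (,), which is not a legal Base64 character!
Wait, if 0x2C is not a legal Base64 character either, why did BETA not fail?
Let me check BETA's first few bytes in full: 0x08, 0x2C, 0x44, 0x63...
Base64 legal character check:
, (0x2C) - ❌ not legal Base64
D (0x44) - ✅ legal
c (0x63) - ✅ legal

Since there is an illegal character , right at the start, why was there no error?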

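Along the way, a possible direction finally emerged:

Android's Base64 decoder skips some control characters, but not all of them!
Why do different control characters behave differently?
U+0003 (ETX) happens not to be in the Android Base64 decoder's "auto-skip" list!

In fact, although BETA's decrypted content is displayed as JSON, its first few characters may already be corrupted

Direct cause of the ALPHA failure:
    The first character U+0003 is not in the Android Base64 decoder's skip list
    The decoder immediately throws a bad-base64 exception

Why BETA/GAMMA did not fail:
    Their first control characters (U+000B, U+0001, U+0000) are in the skip list
    The Base64 decoder skips them and continues with the remaining characters, decoding "successfully" (but the data is actually corrupted)

Data integrity:
    The data in all three files is corrupted!
    ALPHA exposed the problem by throwing
    The other files "got lucky" and passed decryption, but the decrypted result is wrong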

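Simulating the Android environment

I handed the three files to my AI mentor for analysis. To speed things up, I copied the Base64 source from Android-36 into IDEA to reproduce and debug the issue locally. The result matched what I had seen on the device: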

ALPHA   -> ❌ bad base-64
BETA    -> ✅ SUCCESS
GAMMA   -> ✅ SUCCESS

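Detailed data analysis

After many rounds of firm, context-heavy instructions, the AI generated test code that prints detailed information. I have to admit that AI is remarkably efficient at this kind of task: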

import android.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class FileDecryptDebug {
        
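    // Note: aesCTRPKCS5Decrypt(...) called below is not shown in this listing
    // Simulate the new version's read path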
    public static String readNewWay(File file, String fileName) {
        System.out.println("\n=== " + fileName + " ===");
        System.out.println("File size: " + file.length() + " bytes");
        
        try (FileInputStream fis = new FileInputStream(file)) {
            byte[] data = new byte[(int) file.length()];
            int readCount = fis.read(data);
            
            System.out.println("Read count: " + readCount);
            System.out.print("First 10 bytes (hex): ");
            for (int i = 0; i < Math.min(10, data.length); i++) {
                System.out.printf("%02X ", data[i]);
            }
            System.out.println();
            
            // Approach 1: plain new String (what your code does)
            String directString = new String(data, StandardCharsets.UTF_8);
            System.out.println("Direct String length: " + directString.length());
            System.out.print("First 10 chars (Unicode): ");
            for (int i = 0; i < Math.min(10, directString.length()); i++) {
                char c = directString.charAt(i);
                System.out.printf("U+%04X ", (int)c);
            }
            System.out.println();
            
            System.out.print("First 10 chars (visible): ");
            for (int i = 0; i < Math.min(10, directString.length()); i++) {
                char c = directString.charAt(i);
                if (c >= 0x20 && c < 0x7F) {
                    System.out.print(c);
                } else {
                    System.out.printf("[%02X]", (int)c);
                }
            }
            System.out.println();
            
            // Interpret the first two bytes as a length header
            if (data.length >= 2) {
                int lenFromHeader = ((data[0] & 0xFF) << 8) | (data[1] & 0xFF);
                System.out.printf("Length from header: %d (0x%04X)%n", lenFromHeader, lenFromHeader);
                System.out.printf("Data length - 2 = %d%n", data.length - 2);
                System.out.printf("Is writeUTF format? %s%n", (lenFromHeader == data.length - 2));
            }
            
            // Try Base64 decoding
            try {
                System.out.println("\n--- Attempting direct Base64 decode ---");
                byte[] decoded = Base64.decode(directString, Base64.NO_WRAP);
                System.out.println("✅ Direct decode SUCCESS! Decoded " + decoded.length + " bytes");
                
                // Try decrypting
                try {
                    String decrypted = aesCTRPKCS5Decrypt(directString, fileName);
                    System.out.println("✅ Decryption SUCCESS: " + 
                        (decrypted != null ? decrypted.substring(0, Math.min(50, decrypted.length())) + "..." : "null"));
                    return decrypted;
                } catch (Exception e) {
                    System.out.println("❌ Decryption FAILED: " + e.getMessage());
                }
            } catch (IllegalArgumentException e) {
                System.out.println("❌ Direct decode FAILED: " + e.getMessage());
                
                // Try skipping the first two characters
                if (directString.length() > 2) {
                    try {
                        System.out.println("\n--- Attempting decode after skipping first 2 chars ---");
                        String skipped = directString.substring(2);
                        System.out.println("Skipped string starts with: " + 
                            skipped.substring(0, Math.min(20, skipped.length())));
                        byte[] decoded = Base64.decode(skipped, Base64.NO_WRAP);
                        System.out.println("✅ Skip-2 decode SUCCESS! Decoded " + decoded.length + " bytes");
                        
                        // Try decrypting
                        try {
                            String decrypted = aesCTRPKCS5Decrypt(skipped, fileName);
                            System.out.println("✅ Skip-2 decryption SUCCESS!");
                            return decrypted;
                        } catch (Exception ex) {
                            System.out.println("❌ Skip-2 decryption FAILED: " + ex.getMessage());
                        }
                    } catch (IllegalArgumentException e2) {
                        System.out.println("❌ Skip-2 decode also FAILED: " + e2.getMessage());
                    }
                }
                
                // Try DataInputStream.readUTF (simulating the old version's read path)
                try (FileInputStream fis2 = new FileInputStream(file);
                     DataInputStream dis = new DataInputStream(fis2)) {
                    System.out.println("\n--- Attempting DataInputStream.readUTF() ---");
                    String utfString = dis.readUTF();
                    System.out.println("readUTF() result length: " + utfString.length());
                    System.out.println("readUTF() starts with: " + 
                        utfString.substring(0, Math.min(20, utfString.length())));
                    
                    try {
                        byte[] decoded = Base64.decode(utfString, Base64.NO_WRAP);
                        System.out.println("✅ readUTF + Base64 decode SUCCESS!");
                        
                        String decrypted = aesCTRPKCS5Decrypt(utfString, fileName);
                        System.out.println("✅ readUTF + decryption SUCCESS!");
                        return decrypted;
                    } catch (Exception ex) {
                        System.out.println("❌ readUTF decode/decrypt FAILED: " + ex.getMessage());
                    }
                } catch (Exception ex) {
                    System.out.println("❌ readUTF() FAILED: " + ex.getMessage());
                }
            }
            
        } catch (Exception e) {
            System.err.println("Read error: " + e.getMessage());
            e.printStackTrace();
        }
        
        return null;
    }
    
    public static void main(String[] args) {
        // Set the file path
        String basePath = "./"; // change to the directory that contains your files
        
        String[] fileNames = {"ALPHA", "BETA", "GAMMA"};
        
        for (String fileName : fileNames) {
            File file = new File(basePath + fileName);
            if (file.exists()) {
                readNewWay(file, fileName);
            } else {
                System.out.println("File not found: " + fileName);
            }
            System.out.println("\n" + "=".repeat(60));
        }
    }
}

The output was as follows:

=== ALPHA ===
Original bytes: 03 58 56 55 35 66 66 4D 71 62 ...
String length: 858
First 10 Unicode: U+0003 U+0058 U+0056 U+0055 U+0035 U+0066 U+0066 U+004D U+0071 U+0062
Base64 decode result: ❌ bad base-64

=== BETA ===
Original bytes: 0B 2C 44 63 55 73 32 77 68 6A ...
String length: 2862
First 10 Unicode: U+000B U+002C U+0044 U+0063 U+0055 U+0073 U+0032 U+0077 U+0068 U+006A
Base64 decode result: ✅ SUCCESS

=== GAMMA ===
Original bytes: 00 18 2B 47 67 6D 47 30 4E 30 ...
String length: 26
First 10 Unicode: U+0000 U+0018 U+002B U+0047 U+0067 U+006D U+0047 U+0030 U+004E U+0030
Base64 decode result: ✅ SUCCESS

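At this point the core question was finally on the table: why does only U+0003 make Base64 decoding fail? This was exactly what had been puzzling me all along.

Trial and error

The AI speculated that the process step of android.util.Base64#decode might be ending up in a bad state. Following that lead, I reworked the test code to log the decoder's state machine and print the DECODE table value of each byte: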

import java.io.*;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PreciseBase64Debug {
    
    public static void main(String[] args) throws Exception {
        String[] fileNames = {"ALPHA", "BETA", "GAMMA"};
        
        for (String fileName : fileNames) {
            File file = new File(fileName);
            if (!file.exists()) {
                System.out.println("File not found: " + fileName);
                continue;
            }
            
            System.out.println("\n=== " + fileName + " ===");
            byte[] fileBytes = readFileBytes(file);
            
            // Show the raw bytes
            System.out.print("Original bytes (first 10): ");
            for (int i = 0; i < Math.min(10, fileBytes.length); i++) {
                System.out.printf("%02X ", fileBytes[i]);
            }
            System.out.println();
            
            // Extract the length header
            int lenFromHeader = ((fileBytes[0] & 0xFF) << 8) | (fileBytes[1] & 0xFF);
            System.out.printf("Length header: %d (0x%04X)%n", lenFromHeader, lenFromHeader);
            System.out.printf("File size - 2: %d%n", fileBytes.length - 2);
            
            // Method 1: mimic your code - new String(byte[])
            String str1 = new String(fileBytes);
            System.out.println("\n--- Method 1: new String(byte[]) ---");
            analyzeString(str1, "default charset");
            
            // Method 2: specify UTF-8
            String str2 = new String(fileBytes, StandardCharsets.UTF_8);
            System.out.println("\n--- Method 2: new String(byte[], UTF-8) ---");
            analyzeString(str2, "UTF-8");
            
            // Method 3: skip the first 2 bytes, then build the string
            byte[] dataWithoutHeader = Arrays.copyOfRange(fileBytes, 2, fileBytes.length);
            String str3 = new String(dataWithoutHeader, StandardCharsets.UTF_8);
            System.out.println("\n--- Method 3: Skip header + UTF-8 ---");
            analyzeString(str3, "UTF-8 (no header)");
            
            // Test Base64 decoding
            System.out.println("\n--- Base64 Decode Tests ---");
            testBase64Decode(str1, "Method 1");
            testBase64Decode(str2, "Method 2");
            testBase64Decode(str3, "Method 3");
        }
    }
    
    private static byte[] readFileBytes(File file) throws IOException {
        try (FileInputStream fis = new FileInputStream(file)) {
            byte[] data = new byte[(int) file.length()];
            int read = fis.read(data);
            if (read != data.length) {
                throw new IOException("Incomplete read");
            }
            return data;
        }
    }
    
    private static void analyzeString(String str, String charsetName) {
        System.out.println("String length: " + str.length());
        
        // Show the Unicode code points of the first 10 characters
        System.out.print("First 10 Unicode: ");
        for (int i = 0; i < Math.min(10, str.length()); i++) {
            char c = str.charAt(i);
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
        
        // Check for control characters
        System.out.print("Control chars at start: ");
        for (int i = 0; i < Math.min(5, str.length()); i++) {
            char c = str.charAt(i);
            if (c < 0x20) {
                System.out.printf("[%d]=U+%04X ", i, (int) c);
            } else {
                System.out.printf("[%d]='%c' ", i, c);
            }
        }
        System.out.println();
        
        // Test a getBytes() round trip
        byte[] bytes1 = str.getBytes();
        byte[] bytes2 = str.getBytes(StandardCharsets.UTF_8);
        
        System.out.print("getBytes() first 10: ");
        for (int i = 0; i < Math.min(10, bytes1.length); i++) {
            System.out.printf("%02X ", bytes1[i]);
        }
        System.out.println();
        
        System.out.print("getBytes(UTF-8) first 10: ");
        for (int i = 0; i < Math.min(10, bytes2.length); i++) {
            System.out.printf("%02X ", bytes2[i]);
        }
        System.out.println();
        
        // Compare the byte arrays
        boolean same = Arrays.equals(bytes1, bytes2);
        System.out.println("getBytes() == getBytes(UTF-8)? " + same);
    }
    
    private static void testBase64Decode(String str, String methodName) {
        try {
            // Mimic what Base64.decode(String, int) does
            byte[] strBytes = str.getBytes();
            
            // Look up the DECODE table values of the first few bytes
            System.out.print(methodName + " - DECODE values (first 5): ");
            for (int i = 0; i < Math.min(5, strBytes.length); i++) {
                int b = strBytes[i] & 0xFF;
                int decodeValue = getDecodeValue(b);
                System.out.printf("[%d]=0x%02X->%2d ", i, b, decodeValue);
            }
            System.out.println();
            
            // Actual decode test (using the simplified model of the Android Base64 decoder below)
            boolean success = testAndroidBase64Decode(strBytes);
            System.out.println(methodName + " - Decode result: " + (success ? "✅ SUCCESS" : "❌ FAILED"));
            
        } catch (Exception e) {
            System.out.println(methodName + " - Exception: " + e.getClass().getSimpleName() + ": " + e.getMessage());
        }
    }
    
    private static int getDecodeValue(int b) {
        // Simplified DECODE table (key parts only)
        if (b >= 'A' && b <= 'Z') return b - 'A';
        if (b >= 'a' && b <= 'z') return b - 'a' + 26;
        if (b >= '0' && b <= '9') return b - '0' + 52;
        if (b == '+') return 62;
        if (b == '/') return 63;
        if (b == '=') return -2;
        return -1;  // SKIP
    }
    
    private static boolean testAndroidBase64Decode(byte[] input) {
        // Simplified version of the Android Base64 decoding logic
        int state = 0;
        int value = 0;
        
        for (int i = 0; i < input.length; i++) {
            int d = getDecodeValue(input[i] & 0xFF);
            
            switch (state) {
                case 0:
                    if (d >= 0) {
                        value = d;
                        state = 1;
                    } else if (d != -1) {  // d != SKIP
                        System.out.print("[FAIL at pos " + i + ": d=" + d + "] ");
                        return false;
                    }
                    break;
                case 1:
                    if (d >= 0) {
                        value = (value << 6) | d;
                        state = 2;
                    } else if (d != -1) {
                        System.out.print("[FAIL at pos " + i + ": d=" + d + "] ");
                        return false;
                    }
                    break;
                case 2:
                    if (d >= 0) {
                        value = (value << 6) | d;
                        state = 3;
                    } else if (d == -2) {  // EQUALS
                        state = 4;
                    } else if (d != -1) {
                        System.out.print("[FAIL at pos " + i + ": d=" + d + "] ");
                        return false;
                    }
                    break;
                case 3:
                    if (d >= 0) {
                        state = 0;
                    } else if (d == -2) {
                        state = 5;
                    } else if (d != -1) {
                        System.out.print("[FAIL at pos " + i + ": d=" + d + "] ");
                        return false;
                    }
                    break;
                case 4:
                    if (d == -2) {
                        state = 5;
                    } else if (d != -1) {
                        System.out.print("[FAIL at pos " + i + ": d=" + d + "] ");
                        return false;
                    }
                    break;
                case 5:
                    if (d != -1) {
                        System.out.print("[FAIL at pos " + i + ": d=" + d + "] ");
                        return false;
                    }
                    break;
            }
        }
        
        // Check the final state
        return state == 0 || state == 5;
    }
}

The failing position finally showed up:

// Other, normal log output omitted
Method 1 - DECODE values (first 5): [0]=0x03->-1 [1]=0x58->23 [2]=0x56->21 [3]=0x55->20 [4]=0x35->57 
[FAIL at pos 857: d=-2] Method 1 - Decode result: ❌ FAILED

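The failure is at index 857 and the offending value is -2, i.e. the Base64 padding character '='!

While handling the end of the file, ...7slOg==, the decoder had already skipped the leading 0x03 successfully; the state machine only broke down when it reached the trailing = characters.

The failure is at the end of the file

To understand why padding causes the failure, we need to look at the Base64 decoder's state machine (simplified from Android's Base64.Decoder.process() method):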

int d = alphabet[input[p++] & 0xff];
switch (state) {
    case 0:
        if (d >= 0) { value = d; ++state; }
        else if (d != SKIP) { this.state = 6; return false; }
        break;
    case 1:
        if (d >= 0) { value = (value << 6) | d; ++state; }
        else if (d != SKIP) { this.state = 6; return false; }
        break;
    case 2:
        if (d >= 0) { value = (value << 6) | d; ++state; }
        else if (d == EQUALS) { /* expect a second '=' */ state = 4; }
        else if (d != SKIP) { this.state = 6; return false; }
        break;
    case 3:
        if (d >= 0) { /* emit 3 bytes and return to state 0 */ }
        else if (d == EQUALS) { /* expect end of input */ state = 5; }
        else if (d != SKIP) { this.state = 6; return false; }
        break;
    case 4:
        if (d == EQUALS) { ++state; }
        else if (d != SKIP) { this.state = 6; return false; }
        break;
    case 5:
        if (d != SKIP) { this.state = 6; return false; }
        break;
}

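The Base64 standard specifies that if the raw byte count modulo 3 is 1, two = are appended; if it is 2, one = is appended; if it divides evenly, none are. The decoder advances its state with every data character it consumes, so = may only appear in specific states (2 or 3 above), and the number of = must match.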
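The padding rule is easy to check with a few lengths. The sketch below uses the JDK's java.util.Base64 rather than android.util.Base64 (an assumption made so that it runs on a desktop JVM; both produce the same padded output for the standard alphabet), and paddingCount is a hypothetical helper name.

```java
import java.util.Base64;

public class PaddingSketch {

    /** Number of '=' characters at the end of the padded Base64 encoding of rawLength bytes. */
    static int paddingCount(int rawLength) {
        String encoded = Base64.getEncoder().encodeToString(new byte[rawLength]);
        int count = 0;
        for (int i = encoded.length() - 1; i >= 0 && encoded.charAt(i) == '='; i--) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(paddingCount(1));   // 2  (1 mod 3 == 1)
        System.out.println(paddingCount(2));   // 1  (2 mod 3 == 2)
        System.out.println(paddingCount(3));   // 0  (3 mod 3 == 0)
        System.out.println(paddingCount(640)); // 2  (640 raw bytes encode to 856 characters)
    }
}
```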

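Misalignment: the hazard left behind by the length header

Now the decoding of ALPHA can be reconstructed in full:

The file is 858 bytes in total; the first two bytes are the length header 0x03 0x58.
The real Base64 ciphertext is the remaining 856 bytes, VU5ffMqbL84YIaKc7Qq...7slOg==, i.e. 854 data characters followed by 2 =.
When the Base64 decoder reads the whole file:
- Position 0: 0x03 (DECODE value -1) is skipped, and the state machine stays in state = 0.
- Position 1: 0x58 is the ASCII letter X, a legal Base64 character (DECODE value 23, as the log above shows), so it is consumed as data.
- Positions 2 to 855: the 854 genuine data characters are processed normally.
- Key point: the decoder has now consumed 855 data characters instead of 854.
855 divided by 4 leaves a remainder of 3, so the state machine is in state 3 and will accept exactly one =.
But ALPHA's Base64 string ends with 2 =.
The first = (position 856) moves the decoder from state 3 to state 5, where only skippable characters are allowed. The second = (position 857) therefore hits the d != SKIP branch and triggers return false.

In short: the header byte that is not skipped, because it happens to be a legal Base64 character, shifts the 4-character grouping by one, so the padding shows up where the state machine does not expect it.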
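The walk-through can be replayed with a small model of the state machine. This is a sketch under assumptions, not the Android implementation: it follows the simplified listing above, treats every byte outside the Base64 alphabet as SKIP, and rejects states 1 and 4 at end of input (my reading of the AOSP decoder). The payloads are synthetic (all-zero bytes encoded with java.util.Base64), chosen only to have the same lengths and padding as ALPHA and BETA.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class HeaderAlignmentSketch {
    static final int SKIP = -1, EQUALS = -2;

    static int decodeValue(int b) {
        if (b >= 'A' && b <= 'Z') return b - 'A';
        if (b >= 'a' && b <= 'z') return b - 'a' + 26;
        if (b >= '0' && b <= '9') return b - '0' + 52;
        if (b == '+') return 62;
        if (b == '/') return 63;
        if (b == '=') return EQUALS;
        return SKIP; // everything else, control characters and ',' included
    }

    /** Index of the rejected byte, input.length if rejected at end of input, or -1 if accepted. */
    static int firstFailure(byte[] input) {
        int state = 0;
        for (int i = 0; i < input.length; i++) {
            int d = decodeValue(input[i] & 0xFF);
            if (d == SKIP) continue;
            switch (state) {
                case 0: case 1: // '=' is not allowed here
                    if (d >= 0) state++; else return i;
                    break;
                case 2: // data, or the first of two '='
                    state = (d >= 0) ? 3 : 4;
                    break;
                case 3: // data completes a group, or a single '='
                    state = (d >= 0) ? 0 : 5;
                    break;
                case 4: // only the second '=' is allowed
                    if (d == EQUALS) state = 5; else return i;
                    break;
                default: // state 5: nothing but skippable bytes may follow
                    return i;
            }
        }
        return (state == 1 || state == 4) ? input.length : -1;
    }

    /** Prepends a 2-byte header to a Base64 string, as writeUTF would. */
    static byte[] withHeader(int hi, int lo, String base64) {
        byte[] body = base64.getBytes(StandardCharsets.US_ASCII);
        byte[] out = new byte[body.length + 2];
        out[0] = (byte) hi;
        out[1] = (byte) lo;
        System.arraycopy(body, 0, out, 2, body.length);
        return out;
    }

    public static void main(String[] args) {
        String alphaLike = Base64.getEncoder().encodeToString(new byte[640]);  // 856 chars, ends with "=="
        String betaLike = Base64.getEncoder().encodeToString(new byte[2145]);  // 2860 chars, no padding
        System.out.println(firstFailure(withHeader(0x03, 0x58, alphaLike))); // 857
        System.out.println(firstFailure(withHeader(0x0B, 0x2C, betaLike)));  // -1
    }
}
```

The ALPHA-like input is rejected at index 857, the same position as in the log, while the BETA-like input is accepted because both header bytes are skipped.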

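Why BETA and GAMMA were spared

File	Header bytes	How the decoder treats the header	Base64 length	Padding
BETA	0x0B 0x2C	both skipped (neither is in the Base64 alphabet)	2860 bytes (multiple of 4)	no `=`
GAMMA	0x00 0x18	both skipped (neither is in the Base64 alphabet)	24 bytes (multiple of 4)	no `=`
ALPHA	0x03 0x58	0x03 skipped, 0x58 (X) consumed as data	856 bytes (multiple of 4)	`==`

For BETA and GAMMA, neither header byte is a Base64 alphabet character (0x2C is a comma), so the decoder skips the whole header and the 4-aligned payload behind it decodes correctly. In ALPHA, the low header byte 0x58 is the letter X, so one stray data character is pushed in front of the payload, and the misalignment surfaces at the padding.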

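Precise trigger conditions

The bug requires all of the following:

The file was written by writeUTF() and carries the 2-byte length header;
The file is read back as a whole and passed to android.util.Base64.decode, which skips every byte outside the Base64 alphabet (DECODE value SKIP), control characters included;
At least one header byte is itself a Base64 alphabet character (in this case the low byte 0x58, X), so the decoder consumes it as data and the 4-character grouping shifts. For payloads shorter than 8192 bytes the high byte lies in 0x00-0x1F and is always skipped, so the low byte decides;
The shifted grouping puts the padding (or the end of the input) in a state where it is not allowed, as with ALPHA's trailing ==.

The bad-base64 exception is thrown only when all of these conditions hold. When both header bytes are skipped, as with BETA and GAMMA, the payload decodes correctly and the problem stays hidden.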
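Whether a given payload length is at risk can be estimated from the header bytes alone. The following sketch is an illustration with hypothetical names: it counts how many of the two writeUTF header bytes a decoder that skips all non-alphabet bytes would not skip. A result above zero means the header interferes with decoding.

```java
public class HeaderByteSketch {

    /** True if the byte is a Base64 alphabet character or '=' and is therefore not skipped. */
    static boolean notSkipped(int b) {
        return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z')
                || (b >= '0' && b <= '9') || b == '+' || b == '/' || b == '=';
    }

    /** Number of writeUTF header bytes that the decoder would not skip for this payload length. */
    static int headerBytesNotSkipped(int payloadLength) {
        int hi = (payloadLength >> 8) & 0xFF; // big-endian: high byte first
        int lo = payloadLength & 0xFF;
        return (notSkipped(hi) ? 1 : 0) + (notSkipped(lo) ? 1 : 0);
    }

    public static void main(String[] args) {
        System.out.println(headerBytesNotSkipped(856));  // 1  ALPHA: 0x03 0x58, 0x58 is 'X'
        System.out.println(headerBytesNotSkipped(2860)); // 0  BETA:  0x0B 0x2C
        System.out.println(headerBytesNotSkipped(24));   // 0  GAMMA: 0x00 0x18
    }
}
```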

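Solution

As in the earlier fix, the read returns empty data when it detects the old format, and the upper layer fetches the data again and writes it in the new format. This avoids relying on the Base64 decoder's implicit tolerance of non-alphabet characters.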

public static String readOrCreate(String fileName) {
    File file = getFile(fileName);
    try (FileInputStream fis = new FileInputStream(file)) {
        byte[] data = new byte[(int) file.length()];
        int readCount = fis.read(data);
        if (readCount <= 0) return null;
        // Detect the writeUTF format: 2-byte big-endian length header + payload
        if (readCount > 2) {
            int lengthFromHeader = ((data[0] & 0xff) << 8) | (data[1] & 0xff);
            if (lengthFromHeader == readCount - 2) {
                return null; // old format: trigger a re-fetch
            }
        }
        // android.util.Base64 has no decodeToString; decode to bytes, then build the String
        String cipherText = new String(data, 0, readCount, StandardCharsets.UTF_8);
        return new String(Base64.decode(cipherText, Base64.NO_WRAP), StandardCharsets.UTF_8);
    } catch (Exception e) {
        return null;
    }
}

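Summary and reflections

Give the AI complete context: early in the analysis, the environment details, file contents, and symptom descriptions were incomplete, which sent both the AI and me down dead ends several times and cost a lot of time. Once the problem was described precisely enough, the analysis converged quickly.
Do not take things for granted: the observed symptom is not necessarily the logic that actually ran. In a complex system, multiple layers of exception handling can hide where the problem really occurs. Do not stop at the surface; verify every assumption.
Look at the whole chain: the early investigation focused too much on the bytes at the start of the file. Only after adding state machine logging did it become clear that the failure was at the end of the file. A global view often keeps you from getting stuck in a local optimum.
Simulate the environment to move faster: extracting the Android source, setting up a local runtime, and reproducing with the failing files made fast iteration on a PC possible and shortened the investigation.
Do not stop at the fix; ask for the root cause: after the problem appears to be solved, ask one more "why". Settling for a program that merely runs, without digging deeper, only lets the same kind of hazard resurface later in another form.

A reproducible demo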

import android.util.Base64;

import java.io.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class Test {

    public static void main(String[] args) throws IOException {
        write();
        System.out.println("read(false): " + read(false));
        System.out.println("read(true): " + read(true));
    }

    private static void write() throws IOException {
        // Generate a file that triggers the bug
        // 34 raw bytes -> 48 Base64 chars ending in "==", so the header is 0x00 0x30 and 0x30 ('0') is a Base64 character
        byte[] rawData = new byte[34];
        for (int i = 0; i < rawData.length; i++) {
            rawData[i] = (byte) (65 + (i % 26));
        }
        String base64 = Base64.encodeToString(rawData, Base64.NO_WRAP);
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(baos);
        dos.writeUTF(base64);
        byte[] writeUTFBytes = baos.toByteArray();
        Files.createDirectories(Paths.get("outputs")); // Files.write does not create missing directories
        Files.write(Paths.get("outputs", "trigger_bug.txt"), writeUTFBytes);
    }

    private static String read(boolean fix) throws IOException {
        // Read the whole file through FileInputStream (the new version's read path)
        File file = Paths.get("outputs", "trigger_bug.txt").toFile();
        try (FileInputStream fis = new FileInputStream(file)) {
            byte[] data = new byte[(int) file.length()];
            int readCount = fis.read(data);
            String cipherText;
            if (fix && readCount >= 2) {
                int utfLen = ((data[0] & 0xFF) << 8) | (data[1] & 0xFF);
                if (utfLen + 2 == readCount) {
                    // Old format: skip the first two bytes
                    cipherText = new String(data, 2, utfLen, StandardCharsets.UTF_8);
                } else {
                    // New format: use all bytes
                    cipherText = new String(data, 0, readCount, StandardCharsets.UTF_8);
                }
            } else {
                cipherText = new String(data, 0, readCount, StandardCharsets.UTF_8);
            }
            try {
                byte[] decoded = Base64.decode(cipherText, Base64.NO_WRAP);
                return new String(decoded, StandardCharsets.UTF_8);
            } catch (Exception e) {
                return "Base64 decode FAILED: " + e.getMessage();
            }
        }
    }

}
