471. Encode String with Shortest Length
Given anon-emptystring, encode the string such that its encoded length is the shortest.
The encoding rule is:k[encoded_string]
, where theencoded_stringinside the square brackets is being repeated exactlyktimes.
Note:
- k will be a positive integer and encoded string will not be empty or have extra space.
- You may assume that the input string contains only lowercase English letters. The string's length is at most 160.
- If an encoding process does not make the string shorter, then do not encode it. If there are several solutions, return any of them is fine.
Example 1:
Input: "aaa"
Output: "aaa"
Explanation: There is no way to encode it such that it is shorter than the input string, so we do not encode it.
Example 2:
Input: "aaaaa"
Output: "5[a]"
Explanation: "5[a]" is shorter than "aaaaa" by 1 character.
Example 3:
Input: "aaaaaaaaaa"
Output: "10[a]"
Explanation: "a9[a]" or "9[a]a" are also valid solutions, both of them have the same length = 5, which is the same as "10[a]".
Example 4:
Input: "aabcaabcd"
Output: "2[aabc]d"
Explanation: "aabc" occurs twice, so one answer can be "2[aabc]d".
Example 5:
Input: "abbbabbbcabbbabbbc"
Output: "2[2[abbb]c]"
Explanation: "abbbabbbc" occurs twice, but "abbbabbbc" can also be encoded to "2[abbb]c", so one answer can be "2[2[abbb]c]".
S: DP O(n^3)
dp[i][j]存储i到j的substring的最小encode string
初始状态:len < 4时encode无效,dp[i][j]等于原substring
转移方程:将subtring拆分为左右两个string,
对1 <= k <= j 若左右string长度之和小于原substring长度 dp[i][j] = dp[i][k] + dp[k + 1][j]
注意每次要检测substring自生是否由重复数组构成,这里检测用到一个trick:
假设S 由 T 重复而成,那么将S + S去掉首字符之后依然可以在其中找到S, 且第一个S的index就是重复开始的位置
比如ababab => abababababab => (a)bababababab 重复开始的位置为2,重复的substring为ab
class Solution {
public:
string encode(string s) {
int n = s.size();
dp = vector<vector<string>>(n, vector<string>(n));
//loop through all substr length, at all posible start pos
for (int l = 1; l <= n; ++l) {
for (int i = 0; i + l <= n; ++i) {
int j = i + l - 1;
dp[i][j] = s.substr(i, l);
if (l <= 4) continue;
for (int k = i; k < j; ++k) {
if (dp[i][j].size() > dp[i][k].size() + dp[k + 1][j].size()) {
dp[i][j] = dp[i][k] + dp[k + 1][j];
}
}
string selfEncode = findRepeat(s, i, j);
if (selfEncode.size() < dp[i][j].size()) dp[i][j] = selfEncode;
}
}
return dp[0][n - 1];
}
string findRepeat(string& s, int i, int j) {
string str = s.substr(i, j - i + 1);
int repeatIdx = (str + str).find(str, 1); //从第一位开始找,忽略第0位
if (repeatIdx < str.size()) {
string substr = str.substr(0, repeatIdx);
return to_string(str.size() / substr.size()) + '[' + dp[i][i + repeatIdx - 1] + ']'; //插入的是dp值而不是原substr
}
return s;
}
private:
vector<vector<string>> dp;
};