The post Leetcode 18. 4Sum appeared first on Qi Xi 西奇.

https://leetcode.com/problems/4sum/

This is another N-Sum problem. We may start by asking: is the HashMap approach from 2Sum valid here? The answer is yes, if we use two outer loops to reduce the problem to 2Sum and then solve that part with the HashMap approach. Be aware that we need to sort the array to avoid duplicates. In fact, there is no time complexity benefit in this case. The sorting takes O(N * log(N)), and the two outer loops around the linear 2Sum scan take O(N^2 * N), so the total is O(N^3 + N * log(N)), which is O(N^3).

This is the hash map solution.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.List;

    public class Solution {
        public List<List<Integer>> fourSum(int[] nums, int target) {
            List<List<Integer>> ret = new ArrayList<List<Integer>>();
            Arrays.sort(nums);
            // Map each value to its last index in the sorted array.
            HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
            for (int i = 0; i < nums.length; i++) {
                map.put(nums[i], i);
            }
            // Fix the first two numbers, then solve 2Sum on the rest.
            for (int i = 0; i < nums.length; i++) {
                for (int j = i + 1; j < nums.length; j++) {
                    List<List<Integer>> list = twoSum(nums, j + 1, target - nums[i] - nums[j], map);
                    for (int m = 0; m < list.size(); m++) {
                        list.get(m).add(nums[i]);
                        list.get(m).add(nums[j]);
                        if (!ret.contains(list.get(m))) {
                            ret.add(list.get(m));
                        }
                    }
                }
            }
            return ret;
        }

        public List<List<Integer>> twoSum(int[] nums, int start, int target, HashMap<Integer, Integer> map) {
            List<List<Integer>> ret = new ArrayList<List<Integer>>();
            for (int i = start; i < nums.length; i++) {
                // The complement must sit at a later index than i.
                if (map.containsKey(target - nums[i]) && map.get(target - nums[i]) > i) {
                    List<Integer> list = new ArrayList<Integer>();
                    list.add(nums[i]);
                    list.add(nums[map.get(target - nums[i])]);
                    if (!ret.contains(list)) {
                        ret.add(list);
                    }
                }
            }
            return ret;
        }
    }

The two-pointer solution we have mentioned here is actually easier to understand, although it has the same complexity. Similarly, it uses nested loops for the first two numbers and the two-pointer approach for the last two numbers. This is the two-pointer solution:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class Solution {
        public List<List<Integer>> fourSum(int[] nums, int target) {
            List<List<Integer>> ret = new ArrayList<List<Integer>>();
            Arrays.sort(nums);
            for (int one = 0; one < nums.length; one++) {
                for (int two = one + 1; two < nums.length; two++) {
                    // Two pointers close in on the last two numbers.
                    int three = two + 1;
                    int four = nums.length - 1;
                    while (three < four) {
                        int sum = nums[one] + nums[two] + nums[three] + nums[four];
                        if (sum > target) {
                            four--;
                        } else if (sum < target) {
                            three++;
                        } else {
                            List<Integer> tmp = new ArrayList<Integer>();
                            tmp.add(nums[one]);
                            tmp.add(nums[two]);
                            tmp.add(nums[three]);
                            tmp.add(nums[four]);
                            if (!ret.contains(tmp)) {
                                ret.add(tmp);
                            }
                            three++;
                            four--;
                        }
                    }
                }
            }
            return ret;
        }
    }

- For N-Sum problems, HashMap is best for 2Sum because it has linear complexity, compared to the O(N * log(N)) complexity of two-pointer (which needs a sort first). Starting from 3Sum, 4Sum, etc., the HashMap no longer has a complexity benefit because of the outer loop(s), so both solutions take O(N^2) for 3Sum, O(N^3) for 4Sum, etc. Since two-pointer is easier to implement and more flexible (for example, it can also be used in 3Sum Closest), we recommend using HashMap for 2Sum and two-pointer for 3Sum, 3Sum Closest, 4Sum and so on.
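To make the "linear 2Sum" claim concrete, here is a minimal sketch of a one-pass HashMap 2Sum. The class name `TwoSumSketch` and the choice to return indices (or an empty array when there is no pair) are assumptions made for this illustration, not part of the solutions above.

```java
import java.util.HashMap;
import java.util.Map;

class TwoSumSketch {
    // Returns the indices of two entries that sum to target,
    // or an empty array if no such pair exists (illustrative convention).
    static int[] twoSum(int[] nums, int target) {
        Map<Integer, Integer> seen = new HashMap<>(); // value -> index where it was seen
        for (int i = 0; i < nums.length; i++) {
            Integer j = seen.get(target - nums[i]);   // look up the complement
            if (j != null) {
                return new int[] { j, i };
            }
            seen.put(nums[i], i);
        }
        return new int[0];
    }
}
```

Each element is visited once and each map operation takes expected constant time, which is where the linear bound comes from. No sorting is needed for this variant.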

The post Leetcode 16. 3Sum Closest appeared first on Qi Xi 西奇.

https://leetcode.com/problems/3sum-closest/

I came up with a greedy approach at first: each time, find the number closest to the target, subtract it from the target, remove it from the array, and then recurse on the rest. However, this approach does not work for all cases. For example, it fails in the following case:

    Input: [0,2,1,-3], target = 1
    Output: 3
    Expected: 0

The greedy approach fails because there can be negative numbers. Selecting a large number at any step may still be right, because there may be a negative number that can be paired with this large number to get a sum closer to the target. For example, for [-100,1,100] with target = 0, a greedy approach forces us to select 1 at the first step, even though -100 and 100 are the pair that cancels out exactly.
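When a clever approach fails like this, a brute-force checker is a handy way to confirm the expected answer on small inputs. The following is only a sketch for that purpose: the class and method names are made up for illustration, and it assumes the array has at least three elements.

```java
class ThreeSumClosestBruteForce {
    // Tries every triple and keeps the sum closest to target.
    // O(N^3), so it is only meant for checking small test cases.
    static int closest(int[] nums, int target) {
        int best = nums[0] + nums[1] + nums[2];
        for (int i = 0; i < nums.length; i++) {
            for (int j = i + 1; j < nums.length; j++) {
                for (int k = j + 1; k < nums.length; k++) {
                    int sum = nums[i] + nums[j] + nums[k];
                    if (Math.abs(target - sum) < Math.abs(target - best)) {
                        best = sum;
                    }
                }
            }
        }
        return best;
    }
}
```

For the failing case above, `closest(new int[] {0, 2, 1, -3}, 1)` returns 0, matching the expected value.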

The second thought was to solve it like 3Sum or 2Sum using a HashMap (this solution was described here). However, without a concrete target value (the sum does not have to equal the target), the HashMap solution cannot be used: a HashMap can only tell us whether a value exists, whereas here we need to compare differences from the target.

Then it becomes obvious that we need the two-pointer approach. The 3Sum problem can also be solved with the two-pointer approach, in the same O(N^2) time as the HashMap approach.

The solution is to sort the entire array and then set aside the first number. Initialize the second number (“start”) as the one right after the first number, and the third number (“end”) as the last element of the array. If their sum is less than the target, move the second number forward; similarly, if the sum is greater than the target, move the third number backward. Once the second number meets the third number, move the first number forward and repeat this process. For each comparison, record the minimum difference between the target and the sum. If at any time the sum equals the target, simply return it.

    import java.util.Arrays;

    public class Solution {
        public int threeSumClosest(int[] nums, int target) {
            Arrays.sort(nums);
            int mindiff = Integer.MAX_VALUE;
            int min = 0;
            for (int i = 0; i < nums.length; i++) {
                int start = i + 1;
                int end = nums.length - 1;
                while (start < end) {
                    int sum = nums[i] + nums[start] + nums[end];
                    // Remember the sum closest to the target so far.
                    if (Math.abs(target - sum) < mindiff) {
                        min = sum;
                        mindiff = Math.abs(target - sum);
                    }
                    if (sum < target) {
                        start++;
                    } else if (sum > target) {
                        end--;
                    } else {
                        return min;
                    }
                }
            }
            return min;
        }
    }

- Two-pointer is also a solution for N-Sum problems, in addition to HashMap.
- The basic requirement for greedy is that, at every step, no choice other than the “greedy” one can be part of the answer. If there is a possibility that a “non-greedy” selection could be the answer, do not use a greedy approach.

The post Leetcode 9. Palindrome Number appeared first on Qi Xi 西奇.

https://leetcode.com/problems/palindrome-number/

If this were a string, there would be many ways to check whether it is palindromic: we can use two pointers from the start and the end, or we can reverse the string and compare. However, for an integer, converting it into a string first requires more than constant space. If we can build the reversed integer using constant space and compare it with the original one, that would be great. Recall that in this post: Reverse Integer, we already have a linear solution with constant space that reverses an integer. All we need to do is copy that solution and add a simple comparison. One exception: if the integer is negative, it can never be a palindrome because of the negative sign.

    public class Solution {
        public boolean isPalindrome(int x) {
            if (x < 0) {
                return false;
            }
            int comp = 0;
            int copy = x;
            while (copy != 0) {
                comp = comp * 10 + copy % 10;
                copy = copy / 10;
            }
            return x == comp;
        }
    }

- To check if a string/integer is a palindrome: two-pointer or “reverse and compare”.
- Recall that `%` is a great way to get the digits of a number one by one.
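For the string case mentioned in the first takeaway, the two-pointer check can be sketched as below. The class name is made up for illustration, and the sketch compares characters exactly (no case folding or skipping of punctuation).

```java
class PalindromeStringSketch {
    // Walks one pointer from each end toward the middle and
    // stops at the first mismatch.
    static boolean isPalindrome(String s) {
        int left = 0;
        int right = s.length() - 1;
        while (left < right) {
            if (s.charAt(left) != s.charAt(right)) {
                return false;
            }
            left++;
            right--;
        }
        return true;
    }
}
```

This uses constant extra space on top of the string itself, which is why it is the usual choice when the input is already a string.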

The post Leetcode 8. String to Integer (atoi) appeared first on Qi Xi 西奇.

https://leetcode.com/problems/string-to-integer-atoi/

The main focus of this problem is to come up with all the corner cases. This requires thorough communication with the interviewer before writing any code.

I won’t spend much time on this because there aren’t many algorithmic takeaways.

    public class Solution {
        public int myAtoi(String str) {
            long result = 0;
            int sign = 0;
            int digit = 0;
            int space = 0;
            for (int i = str.length() - 1; i >= 0; i--) {
                char tmp = str.charAt(i);
                if (tmp == ' ') {
                    space = 1;
                } else if (sign == 0 && tmp == '-') {
                    sign = 1;
                    if (space == 1) {
                        result = 0;
                        digit = 0;
                        space = 0;
                    } else if (result == Long.MAX_VALUE) {
                        return Integer.MIN_VALUE;
                    } else {
                        result = result * -1;
                    }
                } else if (sign == 0 && tmp == '+') {
                    sign = 1;
                    if (space == 1) {
                        result = 0;
                        digit = 0;
                        space = 0;
                    }
                } else if (tmp >= '0' && tmp <= '9') {
                    if (space == 1) {
                        result = 0;
                        digit = 0;
                        space = 0;
                    }
                    long add = (long) (Character.getNumericValue(tmp) * Math.pow(10, digit));
                    if (Long.MAX_VALUE - result < add) {
                        continue;
                    }
                    result = result + add;
                    digit++;
                } else {
                    result = 0;
                    digit = 0;
                }
            }
            if (result > (long) Integer.MAX_VALUE) {
                return Integer.MAX_VALUE;
            } else if (result < (long) Integer.MIN_VALUE) {
                return Integer.MIN_VALUE;
            }
            return (int) result;
        }
    }

- Clarify with the interviewer before writing any code

The post Leetcode 7. Reverse Integer appeared first on Qi Xi 西奇.

https://leetcode.com/problems/reverse-integer/

It is a relatively simple problem, but it is also a hard problem.

It is simple because there seem to be many ways to solve it: we can build a string and then convert it to a number; we can convert the number to a character array, then into an array of digits, and calculate the result; we can convert it into a character array and do in-place swaps, etc. It is hard because, among all these methods, we don’t know which one we should start with or what traps lie within each of them.

We still want to solve this problem in a beautiful way, so if we aim for a linear-time solution with constant space, we can tell that we may need a “digit by digit” solution.

How many ways can we get a single digit of a number? We can convert the number to a string, get each character, and then convert the characters to digits; we can also use `%10` to get the least significant digit and `/10` to get the number without the least significant digit, and then repeat the process to get each digit. The second approach is obviously more elegant:

    number   /10    %10
    12345    1234   5
    1234     123    4
    123      12     3
    12       1      2
    1        0      1

We can see that the results of `%10` are just the digits in reverse order, one by one. To turn these results into a number, we can simply do this:

    ((((5 * 10) + 4) * 10 + 3) * 10 + 2) * 10 + 1

Each time, multiply the previous result by 10 and add the current digit. So we can write the code like this. You might want to handle positive and negative numbers separately or take an absolute value, but with the loop condition `cur != 0` this solution works for both positive and negative numbers (in Java, `%` keeps the sign of the dividend and `/` truncates toward zero).

    public class Solution {
        public int reverse(int x) {
            int ret = 0;
            int cur = x;
            int tmp;
            while (cur != 0) {
                tmp = cur % 10;
                ret = ret * 10 + tmp;
                cur = cur / 10;
            }
            return ret;
        }
    }

Another issue is overflow. We should always be aware of overflow when dealing with numbers. In this case, overflow could happen in any iteration of the loop. One way to detect overflow is to use `long` as the result type and check whether the result is outside the integer range before returning. However, we will do something more efficient here: detect overflow proactively. The step that may cause overflow is clearly `ret = ret * 10 + tmp;`, so let’s check whether the result will overflow before doing the calculation. We just need to add

if ((Integer.MAX_VALUE - Math.abs(tmp)) / 10 < Math.abs(ret)) { return 0; }

before this line. What it does is a reverse check that “if there is enough room within the integer range to perform this operation”.

Therefore, here is the final version of our code:

    public class Solution {
        public int reverse(int x) {
            int ret = 0;
            int cur = x;
            int tmp;
            while (cur != 0) {
                tmp = cur % 10;
                if ((Integer.MAX_VALUE - Math.abs(tmp)) / 10 < Math.abs(ret)) {
                    return 0;
                }
                ret = ret * 10 + tmp;
                cur = cur / 10;
            }
            return ret;
        }
    }
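For comparison, the `long`-based check mentioned earlier can be sketched as follows. This is only an illustration of that alternative, with a made-up class name; it accumulates in a wider type and checks the range once at the end.

```java
class ReverseWithLongSketch {
    // Accumulates the reversed digits in a long, which cannot overflow
    // for any 32-bit input, then checks the int range before returning.
    static int reverse(int x) {
        long ret = 0;
        int cur = x;
        while (cur != 0) {
            ret = ret * 10 + cur % 10; // cur % 10 keeps the sign of cur
            cur = cur / 10;
        }
        if (ret > Integer.MAX_VALUE || ret < Integer.MIN_VALUE) {
            return 0;
        }
        return (int) ret;
    }
}
```

The trade-off is that this relies on a wider integer type being available, while the proactive check works within `int` alone.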

- Using `%` is a good way to get each digit of a number. `%` gives us the least significant digit and `/` gives us the rest. Repeating the process on the rest gives us every digit of the number from right to left. This is a nice approach because it works for any base.
- To detect overflow, we recommend proactive detection. To generalize: if `x + 10` could overflow, we just need to check whether `Integer.MAX_VALUE - 10 < x` before doing the addition; if `x - 10` could overflow, we just need to check whether `Integer.MIN_VALUE + 10 > x`. These are just rearrangements of `x + 10 > Integer.MAX_VALUE` and `x - 10 < Integer.MIN_VALUE`, which cannot be tested directly because if overflow indeed occurs, the result of `x + 10` or `x - 10` wraps around. If `x + y` could overflow and we do not know whether they are positive or negative, we can simply check whether `Integer.MAX_VALUE - Math.abs(y) < Math.abs(x)`. This last check is approximate: it may reject some sums that would not actually overflow, and `Math.abs(Integer.MIN_VALUE)` is itself negative, so that value needs separate handling.
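A sign-aware version of the proactive check for a general `x + y` can be sketched as below. The class and method names are made up for illustration; the idea is the same rearrangement as in the bullet above, applied separately for positive and negative `y`.

```java
class OverflowCheckSketch {
    // Reports whether x + y would fall outside the int range,
    // without ever performing the addition itself.
    static boolean addOverflows(int x, int y) {
        if (y > 0) {
            return x > Integer.MAX_VALUE - y; // rearranged x + y > MAX_VALUE
        }
        if (y < 0) {
            return x < Integer.MIN_VALUE - y; // rearranged x + y < MIN_VALUE
        }
        return false; // adding zero never overflows
    }
}
```

Neither subtraction on the right-hand side can overflow, because `y` is positive in the first branch and negative in the second. Java 8 and later also offer `Math.addExact`, which throws an `ArithmeticException` on overflow if an exception is preferable to a manual check.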

The post Leetcode 6. ZigZag Conversion appeared first on Qi Xi 西奇.

https://leetcode.com/problems/zigzag-conversion/

I won’t spend much time on this because there are not many generic approaches to take away.

You may wonder: is there any relationship between the index in the 2-d array and the index in the string? Then you may spend a lot of time trying to figure out the relationship and end up with nothing.

The right thing to do is: just write a program that does exactly what you would do on a piece of paper. Then we can improve on the working version. This approach applies especially to problems like a game, a maze, a word puzzle, etc., that do not seem to be algorithm-heavy.

    import java.util.ArrayList;

    public class Solution {
        public String convert(String s, int numRows) {
            // One list of characters per row of the zigzag.
            ArrayList<Character>[] chars = new ArrayList[numRows];
            for (int i = 0; i < chars.length; i++) {
                chars[i] = new ArrayList<Character>();
            }
            int order = 1; // 1: up to down, -1: down to up
            int last = -1;
            for (int i = 0; i < s.length(); i++) {
                if (order == 1) {
                    if (last == numRows - 1) {
                        if (numRows != 1) {
                            last = last - 1;
                        }
                        order = -1;
                    } else {
                        last = last + 1;
                    }
                    chars[last].add(s.charAt(i));
                } else {
                    if (last == 0) {
                        if (numRows != 1) {
                            last = 1;
                        }
                        order = 1;
                    } else {
                        last = last - 1;
                    }
                    chars[last].add(s.charAt(i));
                }
            }
            StringBuilder str = new StringBuilder();
            for (int i = 0; i < chars.length; i++) {
                for (int j = 0; j < chars[i].size(); j++) {
                    str.append(Character.toString(chars[i].get(j)));
                }
            }
            return str.toString();
        }
    }

- If a problem does not seem to be testing algorithms or data structures, such as a game or a puzzle, just simulate what you would do on a piece of paper. If there is time left, we can try to improve it then.
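As one possible "improve it afterwards" step, the same paper simulation can be written more compactly with one `StringBuilder` per row and a single direction variable. This is a sketch of that idea under a made-up class name, not a different algorithm.

```java
class ZigZagSketch {
    // Simulates writing the characters row by row while bouncing
    // between the top row and the bottom row.
    static String convert(String s, int numRows) {
        if (numRows <= 1 || numRows >= s.length()) {
            return s; // a single row or a single column leaves the order unchanged
        }
        StringBuilder[] rows = new StringBuilder[numRows];
        for (int i = 0; i < numRows; i++) {
            rows[i] = new StringBuilder();
        }
        int row = 0;
        int step = 1; // +1 while moving down, -1 while moving up
        for (int i = 0; i < s.length(); i++) {
            rows[row].append(s.charAt(i));
            if (row == 0) {
                step = 1;
            } else if (row == numRows - 1) {
                step = -1;
            }
            row += step;
        }
        StringBuilder out = new StringBuilder();
        for (StringBuilder r : rows) {
            out.append(r);
        }
        return out.toString();
    }
}
```

For example, `convert("PAYPALISHIRING", 3)` yields `"PAHNAPLSIIGYIR"`.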

The post Leetcode 5. Longest Palindromic Substring appeared first on Qi Xi 西奇.

https://leetcode.com/problems/longest-palindromic-substring/

The problem asks for

Given a string S, find the longest palindromic substring in S. You may assume that the maximum length of S is 1000, and there exists one unique longest palindromic substring.

**REMINDER: read until the end, otherwise you will get it wrong.**

The key word is “longest”. As we have mentioned in previous posts, looking for the “longest” substring usually relates to the Dynamic Programming or Sliding Window algorithm. Let’s see which one fits best here.

If we use the Dynamic Programming approach, we need to memorize something like “the longest palindromic substring length so far” for each position, or something else; this seems feasible.

If we use the Sliding Window approach, we might need two pointers: a start and an end. Each time we expand the end pointer, we check whether the substring between the start and end pointers is palindromic. When the end pointer reaches the end of the string, we move the start pointer forward one position and do the same thing. This is feasible, but it is clearly a brute-force method: it basically tries all possible substrings and checks whether each of them is palindromic. We do not want this, so let’s think more deeply about the Dynamic Programming approach.

We have some initial thoughts on the DP solution, and we know we are going to use an array to memorize things. If you are not familiar with Dynamic Programming, it is highly recommended that you watch this MIT open course.

We have a couple of choices of what to store in the array:

- “longest palindromic substring length so far” for each position.

    " a b c d c a a c d c a a "
      1 1 1 1 3 3 3 4 6 8 8 8

- “longest palindromic substring ended by **this** character” for each position.

    " a b c d c a a c d c a a "
      1 1 1 1 3 1 2 4 6 8 5 7

If you hand-calculate the array values on a piece of paper, you will find that the first choice is not good, because each time we have to try all possible substrings before the current character.

The second choice seems okay. However, we need some deeper analysis.

At each character position, how can we check if there is a substring ended by this character that is also palindromic? For example, suppose we are currently at the position of the question mark.

    " a b c d c a a c d c a a "
      1 1 1 1 3 1 2 ?

We need to check if there is a palindromic substring ended by this “c”. Looking at the previous position, the array value is 2, which means that there is one or more palindromic substrings ended by that “a”, and the longest one has length 2. This is useful, and Dynamic Programming is all about making use of previously calculated information.

    index   0  1  2  3  4  5  6  7  8  9 10 11
          " a  b  c  d  c  a  a  c  d  c  a  a "
    value   1  1  1  1  3  1  2  ?  ?  ?  ?  ?

Let’s name the memo array `arr[]` and the string `str`.

We know that `str.substring(5,7)` (note that the first index of substring in Java is inclusive, and the second index is exclusive) is a palindromic substring “aa” because `arr[6] == 2`. If there is a palindromic substring longer than one character ended by index 7, it requires `str.charAt(7) == str.charAt(6-2)` or `str.charAt(7) == str.charAt(5)` or `str.charAt(7) == str.charAt(6)`. If the first case is true, `arr[7]` should be `arr[6] + 2` because two additional characters (index 4 and index 7) are added to the substring `str.substring(5,7)`; if the second case is true, `arr[7] = 3`; if the third case is true, `arr[7] = 2`. If multiple cases are true, choose the largest value, and if none is true, this array position should just be 1 because a single character is itself a palindrome.

To generalize, it should be clear that the four cases just check the following when introducing a character at index n:

- if `str.charAt(n) == str.charAt(n-arr[n-1]-1)`, it can create a palindrome with `str.substring(n-arr[n-1]-1,n)`, and let `tmp1 = str.substring(n-arr[n-1]-1,n).length() + 1`

- if `str.charAt(n) == str.charAt(n-2)`, it can create a palindrome with `str.substring(n-2,n)`, and let `tmp2 = 3`

- if `str.charAt(n) == str.charAt(n-1)`, it can create a palindrome with the character before it, and let `tmp3 = 2`

- if none of the above is true, it is a one-character palindrome by itself, and let `tmp4 = 1`

We take the max of tmp1, tmp2, tmp3, tmp4 as the current array value.

So far so good.

**HOWEVER, THIS IS WRONG.**

If we have created tests, a simple test will make this fail: “aaaa”

Let’s see what the input generates at index 3:

    index   0 1 2 3
          " a a a a "
    value   1 2 3 ?

The values of the three cases will be, respectively: 1, 3, 2

However, it should be 4.

    index   0 1 2 3 4 5
          " a a a a a a "
    value   1 2 3 4 5 ?

Similar here. Why? **Take a minute to think about the reason. It is very important.**

The reason is: a one-dimensional array collapses the four cases by taking a max value, but all cases should contribute to later calculations. Take the string “aaaa” as an example. This time we do not take the max value; instead, each array position represents “the length of **a** palindromic substring ended by this character”, and we list all possibilities:

    index   0 1 2 3
          " a a a a "
    value   1 1 1 ?  => A
    value   1 1 2 ?  => B
    value   1 1 3 ?  => C
    value   1 2 1 ?  => D
    value   1 2 2 ?  => E
    value   1 2 3 ?  => F

Now, if we check the “four cases condition” on each of these possibilities, we can eventually get the correct answer from possibility B with the case `if (str.charAt(n) == str.charAt(n-arr[n-1] - 1))`. However, this possibility was collapsed/ignored in our original approach.
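The failure on “aaaa” can be reproduced with a small sketch of the four-case recurrence. The class and method names are made up for illustration, and the bounds checks on the indices are an assumption about how the cases would be guarded; the recurrence itself is deliberately the flawed one-dimensional version.

```java
class FlawedPalindromeDp {
    // arr[n] is computed with the four cases and collapsed with max,
    // which intentionally reproduces the mistake discussed in the text.
    static int[] endedBy(String str) {
        int[] arr = new int[str.length()];
        for (int n = 0; n < str.length(); n++) {
            int best = 1; // a single character is a palindrome by itself
            if (n >= 1 && str.charAt(n) == str.charAt(n - 1)) {
                best = Math.max(best, 2);
            }
            if (n >= 2 && str.charAt(n) == str.charAt(n - 2)) {
                best = Math.max(best, 3);
            }
            if (n >= 1) {
                int mirror = n - arr[n - 1] - 1; // character just before the previous palindrome
                if (mirror >= 0 && str.charAt(n) == str.charAt(mirror)) {
                    best = Math.max(best, arr[n - 1] + 2);
                }
            }
            arr[n] = best;
        }
        return arr;
    }
}
```

For “aaaa” this produces 1, 2, 3, 3: once `arr[2]` has been collapsed to 3, the mirror index for position 3 falls off the front of the string, so the length-4 palindrome is never found.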

What should we do now? A one-dimensional array loses information, thus we use a two-dimensional array: boolean arr[][]. We use the first index to indicate the start position, and the second index to indicate the end position, and the value is whether the substring from start position to end position is palindromic. For example, arr[4][6] means the substring(4,7) is palindromic.

With this idea, it is easy to start coding. We need a nested loop, the inner loop starts from the current index of the outer loop because the inner loop indicates the end position.

With all the previous thoughts, it should be easy to write the code. However, there is one place where a mistake is likely. The outer loop does not start from 0; instead, it runs from the end of the string down to 0. The reason is that inside the loop we need to check whether arr[i+1][j-1] is true, so in order to have that value ready, we need to iterate over the 2-d array in descending i and ascending j order. Here is the code:

    public class Solution {
        public String longestPalindrome(String s) {
            // arr[i][j] is true when s.substring(i, j + 1) is a palindrome.
            boolean[][] arr = new boolean[s.length()][s.length()];
            int start = 0;
            int end = 0;
            int max = 0;
            for (int i = s.length() - 1; i >= 0; i--) {
                for (int j = i; j < s.length(); j++) {
                    if (s.charAt(i) == s.charAt(j)) {
                        if (i == j || i + 1 == j || i + 2 == j || arr[i + 1][j - 1] == true) {
                            arr[i][j] = true;
                            if (j - i + 1 > max) {
                                max = j - i + 1;
                                start = i;
                                end = j;
                            }
                        } else {
                            arr[i][j] = false;
                        }
                    } else {
                        arr[i][j] = false;
                    }
                }
            }
            return s.substring(start, end + 1);
        }
    }

There is another solution that uses less space (with the same time complexity). The basic idea is simple: iterate the entire string twice. The first time, use each character as the center of a palindrome and try to expand as much as possible (as long as the prefix and suffix characters are the same). The second time, use every two characters next to each other as the center of a palindrome and try to expand in the same way. Once you got the idea, it is extremely easy to write the code.

    public class Solution {
        public String longestPalindrome(String s) {
            int max = Integer.MIN_VALUE;
            String maxstr = s;
            String tmp;
            // Odd-length palindromes: expand around each single character.
            for (int i = 0; i < s.length(); i++) {
                tmp = check(s, i);
                if (tmp.length() > max) {
                    max = tmp.length();
                    maxstr = tmp;
                }
            }
            // Even-length palindromes: expand around each equal adjacent pair.
            for (int i = 0; i < s.length() - 1; i++) {
                if (s.charAt(i) == s.charAt(i + 1)) {
                    tmp = check(s, i, i + 1);
                    if (tmp.length() > max) {
                        max = tmp.length();
                        maxstr = tmp;
                    }
                }
            }
            return maxstr;
        }

        public String check(String str, int s, int e) {
            while (s >= 0 && e < str.length() && str.charAt(s) == str.charAt(e)) {
                s--;
                e++;
            }
            return str.substring(s + 1, e);
        }

        public String check(String str, int s) {
            int a = s - 1;
            int b = s + 1;
            while (a >= 0 && b < str.length() && str.charAt(a) == str.charAt(b)) {
                a--;
                b++;
            }
            return str.substring(a + 1, b);
        }
    }

However, the most important lesson of this post is not the solutions to this specific problem; it is knowing when it is necessary to consider a 2-dimensional DP instead of a 1-dimensional DP.

A 1-dimensional DP usually collapses information. If the lost information will not be used anymore, that is totally fine; but if a 1-dimensional DP array cannot hold all the information we need, e.g. multiple cases that all need to participate in later calculations, try a multi-dimensional DP.

The post Leetcode 4. Median of Two Sorted Arrays appeared first on Qi Xi 西奇.

https://leetcode.com/problems/median-of-two-sorted-arrays/

(Aside: Please note that this article mainly focuses on the **thought process**. An idea introduced in the beginning may be completely overturned in later paragraphs, so please be sure to read until the end. Also, this is intended to illustrate the process of deriving an initial solution -> finding something wrong -> making a fix -> repeating until a working solution is found.)

This is actually a very difficult problem. Let’s first gather some initial thoughts from the problem description:

There are two sorted arrays nums1 and nums2 of size m and n respectively.

Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

Example 1:

    nums1 = [1, 3]
    nums2 = [2]

The median is 2.0

Example 2:

    nums1 = [1, 2]
    nums2 = [3, 4]

The median is (2 + 3)/2 = 2.5

A trivial solution is to just merge the two sorted arrays into a new sorted array, and then find the median of the new array. This solution requires O(m+n) time and is obviously not what we are looking for.
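Even though it misses the complexity target, the trivial merge solution is a useful baseline for testing a faster one against. A minimal sketch, with a made-up class name and assuming the two arrays together hold at least one element:

```java
class MergeMedianSketch {
    // Merges the two sorted arrays, then reads the middle element(s).
    // O(m + n) time and space.
    static double median(int[] a, int[] b) {
        int[] merged = new int[a.length + b.length];
        int i = 0, j = 0, k = 0;
        while (i < a.length && j < b.length) {
            merged[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
        }
        while (i < a.length) {
            merged[k++] = a[i++];
        }
        while (j < b.length) {
            merged[k++] = b[j++];
        }
        int n = merged.length;
        if (n % 2 == 1) {
            return merged[n / 2];
        }
        // Widen to long so the sum of two large ints cannot overflow.
        return ((long) merged[n / 2 - 1] + merged[n / 2]) / 2.0;
    }
}
```

On the two examples from the problem statement this gives 2.0 and 2.5.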

You may be stuck here for a long time, just like I was. So why not think about what we have, what we can get, and what we need to get?

We have: two sorted arrays.

We can get: their lengths, their values, their min/max, their own medians…

We need to get: the median of the merged array, O(log(m+n)) complexity

What we can do: from the complexity requirement, we can get a sense of recursion + binary search, because the O(log(N)) complexity usually means we need to “cut in half, and recurse on one of the halves”. Let’s just keep this in mind, that we may need to cut the array at some point. We can also get their own medians; because the two arrays are sorted, these may be useful. How can they be used? Not clear yet, so let’s try some examples:

    [2,4,5,6]    => median 4.5
    [7,8]        => median 7.5
    result array [2,4,5,6,7,8] => median 5.5

    [1]          => median 1
    [2,3,4]      => median 3
    result array [1,2,3,4] => median 2.5

    [1,2]        => median 1.5
    [3,4]        => median 3.5
    result array [1,2,3,4] => median 2.5

    [99]         => median 99
    [3,4]        => median 3.5
    result array [3,4,99] => median 4

    [-100,0]     => median -50
    [10,100,200] => median 100
    result array [-100,0,10,100,200] => median 10

As you can easily figure out, the resulting median always falls between the two medians of the arrays. This is extremely helpful because we already know we will be cutting the arrays, and this is a good criterion. Since the resulting median always lies between the two arrays’ own medians, we can actually ignore the numbers outside this range. For example,

    [2,4,5,6]    => median 4.5
    [7,8]        => median 7.5
    result array [2,4,5,6,7,8] => median 5.5
    ignore numbers < 4.5 or > 7.5
    [5,6,7] => median 6 => should be 5.5

    [1]          => median 1
    [2,3,4]      => median 3
    result array [1,2,3,4] => median 2.5
    ignore numbers < 1 or > 3
    [1,2,3] => median 2 => should be 2.5

    [1,2]        => median 1.5
    [3,4]        => median 3.5
    result array [1,2,3,4] => median 2.5
    ignore numbers < 1.5 or > 3.5
    [2,3] => median 2.5 => should be 2.5

    [99]         => median 99
    [3,4]        => median 3.5
    result array [3,4,99] => median 4
    ignore numbers < 3.5 or > 99
    [4,99] => median 51.5 => should be 4

    [-100,0]     => median -50
    [10,100,200] => median 100
    result array [-100,0,10,100,200] => median 10
    ignore numbers < -50 or > 100
    [0,10,100] => median 10 => should be 10

We can see that our assumption is partly right: the correct median indeed falls within the subarray constrained by the two medians; however, it is not always the median of that subarray.

What does this mean? Recall that we know we will be doing something like “cut in half, and recurse on one of the halves”, so it basically means that we got the “cut” part right. We have found out which elements to cut off each time, but we cannot simply recurse with “find median of two sorted arrays” on the remaining arrays; we need to make some changes.

To put it another way, for the arrays [2,4,5,6] and [7,8], we have found out what to cut off:

    [ 2 , 4 , 5 , 6 ] => median 4.5
    [ 7 , 8 ]         => median 7.5

    This is where the resulting median could be within:
    [ 5 , 6 ]
    [ 7 ]
    but, we can not simply do recursion on array [5,6] and [7]

What is the problem? The median of the two original arrays should be calculated from number 5 and 6, but the median of the arrays after cut-off is calculated from 6 only. It is easy to note that this is because we have removed 2 elements before “5”, but only 1 element after “7”. Here we know the cause, we can deal with it by taking the number of elements into consideration.

Actually, we can immediately find that it will be great if we can know the number’s position in the merged sorted array. However, it is not necessary, because we are only looking for the median in the merged sorted array; that is, we are only interested in the middle one or two numbers in the merged array.

It is a new approach. It’s a good time to rethink what we have got.

We now realized that we are just looking for

    arr[length / 2]                                 in the merged sorted array, if length is odd
    (arr[length / 2 - 1] + arr[length / 2]) / 2.0   in the merged sorted array, if length is even

It’s now a “find Kth number in two sorted arrays” problem!

We can still apply the “binary search” approach, but this time what are the criteria to be cut off?

    find the 4th element in the sorted arrays
    4/2 = 2;
    [ 2, 4, 5, 6 ] => 2nd element is 4
    [ 7, 8 ]       => 2nd element is 8
    result [ 2, 4, 5, 6, 7, 8 ] => 4th element is 6

Similarly, if the (k/2)th element of the first array is smaller than the (k/2)th element of the second array (…..4…..8….. in the above example), the 4th element clearly cannot be at or before that 4. So this time, we compare the (k/2)th element of each array and, for the array with the smaller one, cut everything up to and including its (k/2)th element. Now we can recurse, but in the recursive call the “k” we are looking for should be “k minus the number of elements we just cut off”.

Now we have the idea. It is time to write some code and deal with edge cases. There are some edge cases to be aware of, and there is more than one way to deal with them. I’ll provide just one way.

The base case of the recursion is that either an array is empty or k is 1. If either array is empty, we just need to return the kth element of the other array; if k is 1, we are just looking for the first element of the merged sorted array, which is the smaller of the first elements of the two arrays.

In a recursive call, if the length of an array is smaller than (k/2), what should we do? For example,

    find the 4th element in the sorted arrays
    4/2 = 2;
    [ 10 ]      => 2nd element is ?
    [ 7, 8, 9 ] => 2nd element is 8
    result [ 7, 8, 9, 10 ] => 4th element is 10

    find the 4th element in the sorted arrays
    4/2 = 2;
    [ 2 ]       => 2nd element is ?
    [ 7, 8, 9 ] => 2nd element is 8
    result [ 2, 7, 8, 9 ] => 4th element is 9

We can see that if an array has a length smaller than k/2, we cannot cut anything off it, so we have to keep it as is. There are many ways to achieve this; one is to use the max integer value as a stand-in for the missing element in the integer comparison.

Now here is the final code. I did not write the code correctly the first time. Instead, I tried many times and had to fix it each time it failed some corner tests.

It is still not the optimal solution, and it is very common to still not understand the algorithm completely. Here is some additional reading I recommend:

https://discuss.leetcode.com/topic/4996/share-my-o-log-min-m-n-solution-with-explanation

    import java.util.Arrays;

    public class Solution {
        public double findMedianSortedArrays(int[] nums1, int[] nums2) {
            int length = nums1.length + nums2.length;
            if (length % 2 == 1) {
                return findKth(nums1, nums2, length / 2 + 1);
            } else {
                return (findKth(nums1, nums2, length / 2) + findKth(nums1, nums2, length / 2 + 1)) / 2.0;
            }
        }

        // k is 1-based: k == 1 asks for the smallest element overall.
        public double findKth(int[] nums1, int[] nums2, int k) {
            if (nums1.length == 0) {
                return nums2[k - 1];
            }
            if (nums2.length == 0) {
                return nums1[k - 1];
            }
            if (k == 1) {
                return Math.min(nums1[0], nums2[0]);
            }
            double num1, num2;
            // An array shorter than k/2 gets the max value so it is never cut.
            if (nums1.length >= k / 2) {
                num1 = nums1[k / 2 - 1];
            } else {
                num1 = Integer.MAX_VALUE;
            }
            if (nums2.length >= k / 2) {
                num2 = nums2[k / 2 - 1];
            } else {
                num2 = Integer.MAX_VALUE;
            }
            if (num1 < num2) {
                return findKth(Arrays.copyOfRange(nums1, k / 2, nums1.length), nums2, k - k / 2);
            } else {
                return findKth(nums1, Arrays.copyOfRange(nums2, k / 2, nums2.length), k - k / 2);
            }
        }
    }

- For log(N) complexity, consider recursion and binary search.
- Converting a problem into another problem sometimes helps. If stuck, try to abstract it or convert it.
- Eliminating elements that are not possible is a great way to reduce the search space.

The post Leetcode 3. Longest Substring Without Repeating Characters appeared first on Qi Xi 西奇.

https://leetcode.com/problems/longest-substring-without-repeating-characters/

For questions asking about the longest, the largest, etc., we may try to think about Dynamic Programming and Sliding Window. In this case, we can try both on a piece of paper and will find that sliding window is the right approach. If you are not familiar with the sliding window algorithm, imagine a window formed by a starting point and an ending point on an axis. Both can move freely, but the starting point has to stay on the left-hand side of (be smaller than) the ending point. The sliding window algorithm basically moves the ending point until the window can no longer expand, then moves the starting point gradually towards the ending point. Depending on what you need, you may either try to move the ending point further each time you move the starting point, or move the starting point towards the ending point until the window can no longer shrink and then move the ending point further again.

So the sliding window algorithm fits this problem best. The idea is intuitively simple: move the ending point as far as possible. When it reaches the first character that repeats a character already in the substring formed by the starting and ending points, move the starting point right just far enough to exclude the repeated character, then try to expand the window again by moving the ending point. When it encounters a repeated character again, do the same thing.

Based on this description, we can see the basic structure: a while loop that moves from index 0 to the end of the string, and an if statement for the case where we encounter a repeated character; inside the if, we move the starting point to the position right after the earlier occurrence of the repeated character. Every time we expand (move the ending point) we need to record the current max length of the substring; we don’t have to do this when moving the starting point, because that only shrinks the window.

Another question arises: how do we check whether there is a repeated character in the substring? And how do we move the starting point right just enough to exclude the repeated character?

It is review time! Of course, we can use a loop to do this, but this is so stupid. If you have read this post: https://www.imxiqi.com/leetcode-1-two-sum/ the best solution is obvious, a hash map!

We will again build a hash map while expanding the window, using the character as the key and its index as the value.

Everything should be clear now except one thing: when we move the starting point, what about the hash map? We could delete the corresponding entries from the hash map, but because we move the starting point directly to the position after the repeated character, doing this would require a loop, which does not seem efficient. Let’s figure out how to get rid of the hash map entries before the new starting position. Hey, this seems to answer the question already: we just need to ignore the entries whose value (index) is smaller than the starting position. We don’t actually need to remove them; instead, we just add a condition to our if statement.
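To see why that extra condition matters, here is a small sketch that can be run with and without it. It is a for-loop variant that always records the latest index of each character, which is not the exact structure of the solution in this post; the class name `StaleEntryDemo` and the flag `ignoreStale` are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class StaleEntryDemo {
    // ignoreStale = true applies the "index >= start" condition;
    // ignoreStale = false trusts every entry in the map, even outdated ones.
    static int longest(String s, boolean ignoreStale) {
        Map<Character, Integer> lastIndex = new HashMap<>();
        int start = 0;
        int max = 0;
        for (int end = 0; end < s.length(); end++) {
            char c = s.charAt(end);
            Integer prev = lastIndex.get(c);
            if (prev != null && (!ignoreStale || prev >= start)) {
                start = prev + 1;          // jump right past the earlier occurrence
            }
            lastIndex.put(c, end);         // entries left of start are never removed
            max = Math.max(max, end - start + 1);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(longest("abba", true));   // condition applied
        System.out.println(longest("abba", false));  // stale entry for 'a' moves start backwards
    }
}
```

For "abba", the entry for 'a' (index 0) is stale once the window starts at index 2. With the condition the result is 2; without it, the starting point jumps back to index 1 and the sketch reports 3, which is wrong.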

We have seen that the previous takeaway is just used again! Thus it is wise that we use the takeaway from https://www.imxiqi.com/leetcode-2-add-two-numbers/ as well, creating some tests before coding.

Let’s try these:

    "aaaa" => 1
    "abc" => 3
    "abca" => 3
    "abcad" => 4
    "" => 0
    "abbbbbdefbbbbbc" => 4

Okay, hand-running the algorithm against these test cases does not reveal any corner cases we should be aware of so far. Let’s code.

When you start coding, a new question may come up: what should the while condition be? Should we check both the starting point and the ending point, or just the ending point? In this case, because we are looking for the longest substring and moving the starting point only shrinks the window, we don’t have to care about the starting point.

    import java.util.HashMap;

    public class Solution {
        public int lengthOfLongestSubstring(String s) {
            // Maps each character to the index where it was last recorded.
            HashMap<Character, Integer> map = new HashMap<Character, Integer>();
            int start = 0;
            int end = 0;
            int max = 0;
            while (end < s.length()) {
                // Only entries at or after start count as repeats inside the window.
                if (map.containsKey(s.charAt(end)) && map.get(s.charAt(end)) >= start) {
                    start = map.get(s.charAt(end)) + 1;
                } else {
                    map.put(s.charAt(end), end);
                    end ++;
                    max = Math.max(max, end-start);
                }
            }
            return max;
        }
    }
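The hand-written cases from earlier can also be replayed mechanically. The harness below is only a sketch: `LongestSubstringCheck` is a hypothetical class name, and the method body repeats the logic of the solution above as a static method so the example is self-contained.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class LongestSubstringCheck {
    // Same logic as the solution above, made static so main can call it.
    static int lengthOfLongestSubstring(String s) {
        Map<Character, Integer> map = new HashMap<>();
        int start = 0, end = 0, max = 0;
        while (end < s.length()) {
            char c = s.charAt(end);
            if (map.containsKey(c) && map.get(c) >= start) {
                start = map.get(c) + 1;
            } else {
                map.put(c, end);
                end++;
                max = Math.max(max, end - start);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // Input => expected length, taken from the test list in this post.
        Map<String, Integer> cases = new LinkedHashMap<>();
        cases.put("aaaa", 1);
        cases.put("abc", 3);
        cases.put("abca", 3);
        cases.put("abcad", 4);
        cases.put("", 0);
        cases.put("abbbbbdefbbbbbc", 4);
        for (Map.Entry<String, Integer> e : cases.entrySet()) {
            int got = lengthOfLongestSubstring(e.getKey());
            String status = (got == e.getValue()) ? "ok" : "FAIL";
            System.out.println(status + " \"" + e.getKey() + "\" => " + got);
        }
    }
}
```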

- Use previous takeaways!
- For a question asking about longest, largest subarray or something similar, try Sliding Window algorithm.
- Hash Table is a good friend
- To get rid of a lot of things, sometimes we do not need actually to remove them, adding an if condition may be enough.

The post Leetcode 2. Add Two Numbers appeared first on Qi Xi 西奇.

https://leetcode.com/problems/add-two-numbers/

The obvious solution is to go through the two linked lists at the same time, add the digits, and append each sum as a new node to the result. However, this can be a very buggy program if we are not careful.

This time, we will try to find corner cases using test-driven development. So let’s create tests first.

When creating tests, we must come up with all possible situations that may break the code. In production code, we must also be careful about invalid, meaningless, or harmful inputs (e.g. SQL injection), but we don’t have to worry about them here.

Before creating tests, let’s take another look at the input requirement: each input is a linked list of integers. During the interview, it is better to ask the interviewer whether we can assume the input is valid. In this case we can, so we can assume that each linked list node holds an integer from 0 to 9. When creating test cases, we can hand-calculate the expected outputs as well, unless the algorithm is extremely complex.

Some tests and expected outputs can be:

    [0] + [0] => [0]
    [1] + [1] => [2]
    [0] + [9] => [9]
    [1] + [9] => [0, 1]
    [9] + [9] => [8, 1]
    [9, 1] + [1] => [0, 2]
    [9, 1] + [9, 1] => [8, 3]
    [9, 9] + [1] => [0, 0, 1]
    [9, 9] + [9, 9] => [8, 9, 1]

It is highly recommended that you pretend to be the computer and use the algorithm to calculate the expected results, rather than just adding the numbers and writing the result in linked list representation. While hand-calculating the results, you may already find some corners where bugs may hide! Let’s group these cases by our findings (in an interview, we can jot down these findings while creating tests):

- There may be a carry: if the two digits’ sum is >= 10, there is a carry.
- The previous carry may generate another carry: the two digits’ sum may not generate a carry, but after we add the previous carry, it becomes >= 10. For example, the current two digits are 4 and 5, and there is a carry of 1.
- If one linked list is longer, the digits that have no counterpart in the other linked list may still need to deal with a carry.

Thus far, we haven’t come up with any better solution than iterating through the two linked lists at the same time (we could actually just get the numbers represented by the two linked lists, add them up and convert the result into linked list representation, but this is too stupid. We want beautiful and elegant code!), so we need to figure out how to deal with these corner cases in our algorithm.

Obviously, we will be using a while loop, so what should its condition be? According to our findings, every single digit in the two linked lists needs to be checked for these:

- If there is a digit at the current position of the other linked list, add them.
- If there is a carry from the previous position, add the carry.
- If the sum generates a carry, pass the carry on.

So the while condition needs to be true as long as there is any digit left in either linked list. Okay, we now have the skeleton of the algorithm. However, there is one more thing to be aware of. After the while loop ends, that is, after every single digit of the two linked lists has been checked, there may still be a carry left! In this case, we have to add one more node.

Yep, we have almost everything we need. Our code will look very similar to our findings above. It reminds me of a quote from my Software Engineering professor Tracy Lewis-Williams, “If you create tests first, you will find out that your actual code looks very similar to your tests. This is test-driven development”.

Let’s take a look at the code now:

    /**
     * Definition for singly-linked list.
     * public class ListNode {
     *     int val;
     *     ListNode next;
     *     ListNode(int x) { val = x; }
     * }
     */
    public class Solution {
        public ListNode addTwoNumbers(ListNode l1, ListNode l2) {
            ListNode cur1 = l1;
            ListNode cur2 = l2;
            int carry = 0;
            ListNode dummy = new ListNode(0);
            ListNode cur = dummy;
            while (cur1 != null || cur2 != null) {
                int tmp = carry;
                if (cur1 != null) {
                    tmp = tmp + cur1.val;
                    cur1 = cur1.next;
                }
                if (cur2 != null) {
                    tmp = tmp + cur2.val;
                    cur2 = cur2.next;
                }
                cur.next = new ListNode(tmp % 10);
                carry = tmp / 10;
                cur = cur.next;
            }
            if (carry != 0) {
                cur.next = new ListNode(carry);
            }
            return dummy.next;
        }
    }
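To actually run the test list from earlier against this algorithm, a small harness along the following lines could be used. It is a sketch only: `AddTwoNumbersCheck`, `fromDigits`, `toDigits`, and `check` are hypothetical helpers introduced here, and `ListNode` is re-declared as a nested class so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AddTwoNumbersCheck {
    static final class ListNode {
        int val;
        ListNode next;
        ListNode(int x) { val = x; }
    }

    // Same algorithm as the solution above.
    static ListNode addTwoNumbers(ListNode l1, ListNode l2) {
        ListNode dummy = new ListNode(0);
        ListNode cur = dummy;
        int carry = 0;
        while (l1 != null || l2 != null) {
            int tmp = carry;
            if (l1 != null) { tmp += l1.val; l1 = l1.next; }
            if (l2 != null) { tmp += l2.val; l2 = l2.next; }
            cur.next = new ListNode(tmp % 10);
            carry = tmp / 10;
            cur = cur.next;
        }
        if (carry != 0) cur.next = new ListNode(carry);
        return dummy.next;
    }

    // Builds a list from digits written in list order (least significant first).
    static ListNode fromDigits(int... digits) {
        ListNode dummy = new ListNode(0);
        ListNode cur = dummy;
        for (int d : digits) { cur.next = new ListNode(d); cur = cur.next; }
        return dummy.next;
    }

    // Flattens a list back into its digits so results can be compared.
    static List<Integer> toDigits(ListNode node) {
        List<Integer> out = new ArrayList<>();
        for (; node != null; node = node.next) out.add(node.val);
        return out;
    }

    static boolean check(int[] a, int[] b, Integer... expected) {
        return toDigits(addTwoNumbers(fromDigits(a), fromDigits(b)))
                .equals(Arrays.asList(expected));
    }

    public static void main(String[] args) {
        System.out.println(check(new int[]{0}, new int[]{0}, 0));
        System.out.println(check(new int[]{1}, new int[]{9}, 0, 1));
        System.out.println(check(new int[]{9, 1}, new int[]{1}, 0, 2));
        System.out.println(check(new int[]{9, 9}, new int[]{1}, 0, 0, 1));
        System.out.println(check(new int[]{9, 9}, new int[]{9, 9}, 8, 9, 1));
    }
}
```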

- Test-driven: create tests and let tests help find out corner cases

The post Leetcode 1. Two Sum appeared first on Qi Xi 西奇.

https://leetcode.com/problems/two-sum/

The brute force solution is so obvious and trivial; the main focus is to find out an elegant way.

Let’s start by analyzing the brute force approach: two nested loops; the outer loop iterates from index 0 to the end, and the inner loop iterates from the next index of outer loop to the end. We check if they can sum to the target then. An O(n^2) solution. At each inner loop step, we are actually looking for a number that can be summed up to the target with outer loop number. This is where we may try something to improve.
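For reference, the brute force approach described above can be sketched as follows. The class name `TwoSumBruteForce` is made up, and returning an empty array when no pair exists is an assumption of this sketch rather than something the problem requires.

```java
final class TwoSumBruteForce {
    static int[] twoSum(int[] nums, int target) {
        for (int i = 0; i < nums.length; i++) {
            // The inner loop starts right after i, so each pair is checked once.
            for (int j = i + 1; j < nums.length; j++) {
                if (nums[i] + nums[j] == target) {
                    return new int[] {i, j};
                }
            }
        }
        return new int[0]; // assumption: empty array means "no pair found"
    }
}
```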

So, we need a way to find out whether there is a number in the array. This easily leads us to think about using a HashTable. If we can map the array into a hash table, using the **array value** as key, and the array index as value, then we can easily tell if a value exists in the array by checking if the hash table contains the key. The original inner loop is O(N), and the hash table approach reduces it to O(1) because we do not need an inner loop anymore. Building the hash table requires one loop over the array, which is unavoidable, so the entire program is now O(N).

Now we have the basic idea, let’s write the code.

While coding, you may encounter a question: should we use a loop to map the entire array first and then use a loop to check? Can we check while mapping? Will this lose any solution?

Let’s look at the problem description: it does not require the answer indices to be ordered, which means [i,j] and [j,i] are the same. If we check whether the other number exists while mapping the first number, we may not find the answer the first time because the second number is not in the hash map yet, but later on, when we are mapping the second number, the first number is already in the hash map. Thus we will not lose any solutions. Actually, if we think deeper, even if order matters, we can easily just change how we return the answer.

Our goal is to write beautiful and elegant code. Although the two approaches have the same time complexity, let’s adopt the second approach.

After your initial test, you may find a bug: the order of mapping and checking matters! At each loop, if we map first, then the current number will be considered as a candidate second number. For example, take [3,2,4] with a target of 6. If we map 3 and then check whether the map containsKey(6-3), the answer is yes and the result will be [0,0]. This is obviously not the answer we expect, because the problem description does not allow us to use the same element twice. During an interview, this may be a question we should ask to clarify.
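The effect of the ordering can be reproduced with a side-by-side sketch. Both methods below are illustrative: `TwoSumOrderDemo`, `mapThenCheck`, and `checkThenMap` are hypothetical names, and the empty array returned when nothing is found is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

final class TwoSumOrderDemo {
    // Buggy order: the current number is in the map before we look for its partner,
    // so it can be paired with itself.
    static int[] mapThenCheck(int[] nums, int target) {
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < nums.length; i++) {
            map.put(nums[i], i);
            if (map.containsKey(target - nums[i])) {
                return new int[] {i, map.get(target - nums[i])};
            }
        }
        return new int[0];
    }

    // Correct order: look for the partner among earlier numbers, then record this one.
    static int[] checkThenMap(int[] nums, int target) {
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < nums.length; i++) {
            if (map.containsKey(target - nums[i])) {
                return new int[] {i, map.get(target - nums[i])};
            }
            map.put(nums[i], i);
        }
        return new int[0];
    }
}
```

With [3,2,4] and target 6, `mapThenCheck` returns [0,0], while `checkThenMap` returns [2,1], the indices of 4 and 2.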

    import java.util.HashMap;

    public class Solution {
        public int[] twoSum(int[] nums, int target) {
            int[] ret = new int[2];
            // Maps each value seen so far to its index.
            HashMap<Integer, Integer> map = new HashMap<Integer, Integer>();
            for (int i = 0; i < nums.length; i ++) {
                // Check first, then map, so a number is never paired with itself.
                if (map.containsKey(target - nums[i])) {
                    ret[0] = i;
                    ret[1] = map.get(target - nums[i]);
                    return ret;
                }
                map.put(nums[i], i);
            }
            return ret;
        }
    }

- Need to look up a thing fast? Use hash map!
- Need to use loops over and over again to find an element? Try putting everything in a hash map!

The post Network Entropy & Information Gain Analyzing Tool (Transport-Layer Network Entropy Analysis Tool) appeared first on Qi Xi 西奇.

Let me share some code I wrote: https://github.com/xiqi/network-entropy-infogain

This tool infers application-layer network activity by analyzing the entropy of IP address pairs (source/destination) seen at the transport layer (TCP, UDP).

Below is the README from GitHub; for the code itself, see the GitHub link above:

University of Wisconsin-Madison

Final Project for CS740 Spring 2016

This program calculates entropy and information gain of network packets captured by Wireshark.

You can get Wireshark here for free.

- Python3 (https://www.python.org/downloads/)
- matplotlib (http://matplotlib.org/users/installing.html)

Note for Linux users: you may need to use `sudo apt-get install python3-matplotlib` (Debian/Ubuntu) or `sudo yum install python3-matplotlib` (Fedora/Redhat) to install matplotlib for Python3 instead of Python2.

- Open Wireshark and start capturing traffic (you can choose to capture all traffic, or TCP only, UDP only, etc.)
- File -> Save (you have to save it somewhere in order to proceed, but you don’t need this file)
- File -> Export Packet Dissections -> Export as CSV
- Open the CSV file and manually copy the two columns of source and destination IP addresses into a text file, deleting the first line (description row), and save this file as a `.txt` file
- Put this file in the input directory, e.g. `./input/input.txt`

- Run the all-in-one script ./run.sh using the input file path as the only argument, e.g. `./run.py ./input/input.txt`
- It will put four files in the output directory: `.parse`, `.entropy`, `.infogain` and `.png`. The `.parse` file contains the formatted packets as binary strings (you do not need to look into this file). The `.png` file is the plotted graph of entropy (by default it discards the first 10% of entropy data to improve accuracy; you can change this number in calculate.py). The other files are self-explanatory.

- Run `parse.py` using the file path as the only argument, e.g. `./parse.py ./input/input.txt`
- It will put a `.parse` file in the output directory with the same prefix, e.g. `./output/input.txt.parse`
- Run `calculate.py` using the `.parse` file as the only argument. Please note that the `.parse` file is inside the output directory, e.g. `./calculate.py ./output/input.txt.parse`
- It will put three files in the output directory: `.entropy`, `.infogain` and `.png`. The `.png` file is the plotted graph of entropy (by default it discards the first 10% of entropy data to improve accuracy; you can change this number in calculate.py)

- Running `./run-all.py` will let the program process all `.txt` files in the `input` directory. Please make sure that all `.txt` files in the `input` directory are packet data files. This option will not show the plotted graph automatically after processing each file. All output files will be put in the `output` directory as well.

- Run `./clean-all.sh` to clean ALL files in the input and output directories
- Run `./clean-input.sh` to clean ALL files in the input directory
- Run `./clean-output.sh` to clean ALL files in the output directory

The previous version calculated the entropy of the entire packet, while the current version only considers the source/destination IP address pairs. If you would like to use the previous version, just replace `parse.py` and `calculate.py` with `parse2.py` and `calculate2.py` respectively.

The post Heartbleed Detector and Exploiter (Java) appeared first on Qi Xi 西奇.

At the time I wrote a simple Java program and put it on GitHub, but because it was coursework I kept the repository private. Now that the semester is over, I might as well make it public and share it. There are plenty of similar exploit scripts online, though most of them are written in Python.

GitHub repository: https://github.com/xiqi/CS642-EC-Heartbleed

The program is simple to use. After compiling, just run

java HeartbleedDetect localhost

or

java HeartbleedExploit localhost

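You can replace localhost with any IP address or URL, but note that scanning other people’s websites and servers at will is illegal.

Improvements and sharing are welcome; please credit the source.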

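The post Fixing Slow Access from China Caused by WordPress Using Google Fonts appeared first on Qi Xi 西奇.

Besides deleting the relevant statements directly from the source code, I recommend two other ways to solve this problem:

**Disable Google Fonts with a plugin**

Disable Google Fonts: once activated, this plugin removes every feature on the front end and in the admin area that references Google Fonts. Just install and activate it; no configuration is needed.

**Use the China-hosted CDN mirror of Google Fonts provided by Qihoo (360网站卫士)**

For details, see “360网站卫士常用前端公共库CDN服务” (the 360网站卫士 CDN service for common front-end libraries); in short, replace fonts.googleapis.com with fonts.useso.com. Besides editing the source directly, someone has written a WordPress plugin that does the same thing; see the original blog post: http://www.soulteary.com/2014/06/08/replace-google-fonts.html For convenience, you can also click here to download it directly: Replace-Google-Fonts. Install and activate it; no configuration is needed. In my tests the speed improvement was very satisfying.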
